Overview
This tutorial shows how to install Sqoop on an Amazon EMR cluster and import data from a MySQL database into an Amazon S3 bucket.
Tutorial Video
https://www.youtube.com/watch?v=3YJwDJOyDE0
Prerequisites
Download the following files and upload them to your S3 bucket:
- Sqoop Binary - http://archive.apache.org/dist/sqoop/1.4.4/sqoop-1.4.4.bin__hadoop-2.0.4-alpha.tar.gz
- MySQL JDBC Connector - http://dev.mysql.com/downloads/connector/j/5.1.html
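The download-and-upload step can be sketched from a workstation as follows. This is a minimal sketch, assuming `wget` and the AWS CLI (`aws s3 cp`) are installed and configured. The bucket name `my-sqoop-scripts` and the helper functions `s3_dest` and `stage_file` are hypothetical and not part of the original tutorial. The MySQL connector link above is a download page rather than a direct file URL, so that archive is assumed to have been downloaded manually.

```shell
#!/bin/bash
# Hypothetical bucket name; replace it with your own bucket.
BUCKET="${BUCKET:-my-sqoop-scripts}"
SQOOP_URL="http://archive.apache.org/dist/sqoop/1.4.4/sqoop-1.4.4.bin__hadoop-2.0.4-alpha.tar.gz"

# Print the S3 destination for a file, derived from the last part of its URL.
s3_dest() {
  echo "s3://${BUCKET}/$(basename "$1")"
}

# Download a file by URL, then copy it to the bucket under the same name.
stage_file() {
  local url="$1" name
  name="$(basename "$url")"
  wget -q "$url" -O "$name" && aws s3 cp "$name" "$(s3_dest "$url")"
}

# Usage (uncomment to run; requires network access and AWS credentials):
# stage_file "$SQOOP_URL"
# aws s3 cp mysql-connector-java-5.1.33.tar.gz "s3://${BUCKET}/"
```

Keeping the file names unchanged matters, because the install script below refers to the archives by their exact names.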
Install-Sqoop.sh
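#!/bin/bash
# Stop at the first failing command so a partial install is not left behind
set -e
cd /home/hadoop
# Fetch and unpack the Sqoop binary from S3
hadoop fs -copyToLocal s3://synerzip-sqoop-scripts/sqoop-1.4.4.bin__hadoop-2.0.4-alpha.tar.gz sqoop-1.4.4.bin__hadoop-2.0.4-alpha.tar.gz
tar -xzf sqoop-1.4.4.bin__hadoop-2.0.4-alpha.tar.gz
# Fetch and unpack the MySQL JDBC connector from S3
hadoop fs -copyToLocal s3://synerzip-sqoop-scripts/mysql-connector-java-5.1.33.tar.gz mysql-connector-java-5.1.33.tar.gz
tar -xzf mysql-connector-java-5.1.33.tar.gz
# Copy the connector jar into Sqoop's lib directory so Sqoop can load the MySQL driver
cp mysql-connector-java-5.1.33/mysql-connector-java-5.1.33-bin.jar sqoop-1.4.4.bin__hadoop-2.0.4-alpha/lib/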
Ensure the file contains no CRLF (Windows-style) line endings; bash cannot run a script whose lines end in a carriage return.
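One way to check for and remove CRLF line endings before uploading a script is sketched below. The function name `strip_crlf` is hypothetical; it relies only on `grep`, `tr`, and `mktemp`.

```shell
#!/bin/bash
# Check a script for carriage returns and strip them if any are found.
strip_crlf() {
  local file="$1" tmp
  if grep -q $'\r' "$file"; then
    tmp="$(mktemp)"
    # Delete every carriage return, then write the result back into the
    # original file so its permissions are preserved.
    tr -d '\r' < "$file" > "$tmp" && cat "$tmp" > "$file"
    rm -f "$tmp"
    echo "converted: $file"
  else
    echo "clean: $file"
  fi
}

# Usage:
# strip_crlf Install-Sqoop.sh
```

Running it on a file that is already clean leaves the file unchanged and reports `clean`.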
Sqoop-Import-all.sh
#!/bin/bash
cd /home/hadoop/sqoop-1.4.4.bin__hadoop-2.0.4-alpha/bin
# The trailing backslash continues the command on the next line; without it the second line runs as a separate command
./sqoop import --connect jdbc:mysql://db.c5zzejm1gdnx.us-west-1.rds.amazonaws.com/test --username root --password password \
  --table User_Profile --target-dir s3://synerzip-imported-data/User_Profile-`date +"%m-%d-%y_%T"`
Ensure this file also contains no CRLF (Windows-style) line endings.
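The import script gives each run its own target directory by appending a timestamp to the table name. The sketch below isolates that step in a hypothetical helper, `target_dir`, so the name can be inspected before it is passed to `--target-dir`. It uses `%H-%M-%S` in place of `%T` because `%T` expands to `HH:MM:SS`, and avoiding colons in path names is an assumption made here for safety, not something the original script requires.

```shell
#!/bin/bash
# Build a timestamped S3 target directory for one table.
# Arguments: bucket name, table name.
target_dir() {
  local bucket="$1" table="$2"
  # Month-day-year, then hours-minutes-seconds, with no colons in the result.
  echo "s3://${bucket}/${table}-$(date +"%m-%d-%y_%H-%M-%S")"
}

# Usage: pass the result as the value of --target-dir, for example
# --target-dir "$(target_dir synerzip-imported-data User_Profile)"
```

Because the name includes seconds, two imports started at least one second apart write to different directories.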
Steps
Step 1 through Step 14: follow the tutorial video linked above.