Overview
This is a detailed tutorial on how to install Sqoop on an Amazon EMR cluster and import data from a MySQL database into an Amazon S3 bucket.
Tutorial Video
https://www.youtube.com/watch?v=3YJwDJOyDE0
Prerequisites
Download the following files and upload them to your S3 bucket:
- Sqoop Binary - http://archive.apache.org/dist/sqoop/1.4.4/sqoop-1.4.4.bin__hadoop-2.0.4-alpha.tar.gz
- MySQL JDBC Connector (Install-Sqoop.sh expects mysql-connector-java-5.1.33.tar.gz) - http://dev.mysql.com/downloads/connector/j/5.1.html
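The upload step can be sketched with the AWS CLI. This is a minimal sketch, assuming the AWS CLI is installed and configured and that both archives are already in the current directory. `BUCKET` and the `upload_cmd` helper are hypothetical names introduced for illustration only. The helper prints the commands instead of running them, so they can be reviewed first.

```shell
#!/bin/bash
# BUCKET is a placeholder (assumption); set it to the bucket your scripts read from.
BUCKET="${BUCKET:-my-sqoop-scripts}"

# Print (do not run) the `aws s3 cp` command that uploads one local file.
upload_cmd() {
  local file="$1"
  printf 'aws s3 cp %s s3://%s/%s\n' "$file" "$BUCKET" "$(basename "$file")"
}

upload_cmd sqoop-1.4.4.bin__hadoop-2.0.4-alpha.tar.gz
upload_cmd mysql-connector-java-5.1.33.tar.gz
```

Once the printed commands look right, piping the script's output to `bash` runs them.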
Install-Sqoop.sh
#!/bin/bash
cd /home/hadoop
hadoop fs -copyToLocal s3://synerzip-sqoop-scripts/sqoop-1.4.4.bin__hadoop-2.0.4-alpha.tar.gz sqoop-1.4.4.bin__hadoop-2.0.4-alpha.tar.gz
tar -xzf sqoop-1.4.4.bin__hadoop-2.0.4-alpha.tar.gz
hadoop fs -copyToLocal s3://synerzip-sqoop-scripts/mysql-connector-java-5.1.33.tar.gz mysql-connector-java-5.1.33.tar.gz
tar -xzf mysql-connector-java-5.1.33.tar.gz
cp mysql-connector-java-5.1.33/mysql-connector-java-5.1.33-bin.jar sqoop-1.4.4.bin__hadoop-2.0.4-alpha/lib/
Ensure the file contains no CRLF (Windows) line endings; save it with Unix (LF) line endings.
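One way to check for and remove carriage returns before uploading a script is sketched below. The `strip_crlf` function is a hypothetical helper, not part of the tutorial's scripts. It uses only `grep`, `tr`, and `mktemp`, and it removes every carriage return in the file, which is fine for these shell scripts.

```shell
#!/bin/bash
# Strip carriage returns from a script so bash does not see "\r" at line ends.
strip_crlf() {
  local file="$1" tmp cr
  cr="$(printf '\r')"
  if grep -q "$cr" "$file"; then
    tmp="$(mktemp)"
    tr -d '\r' < "$file" > "$tmp"
    cat "$tmp" > "$file"   # overwrite contents, keeping the file's permissions
    rm -f "$tmp"
    echo "converted: $file"
  else
    echo "clean: $file"
  fi
}

# Usage: pass the script path as the first argument, e.g. Install-Sqoop.sh
if [ -n "${1:-}" ]; then
  strip_crlf "$1"
fi
```

The same function can be applied to any script in this tutorial before it is copied to S3.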
Sqoop-Import-all.sh
#!/bin/bash
cd /home/hadoop/sqoop-1.4.4.bin__hadoop-2.0.4-alpha/bin
./sqoop import \
  --connect jdbc:mysql://db.c5zzejm1gdnx.us-west-1.rds.amazonaws.com/test \
  --username root \
  --password password \
  --table User_Profile \
  --target-dir s3://synerzip-imported-data/User_Profile-`date +"%m-%d-%y_%T"`
Ensure the file contains no CRLF (Windows) line endings; save it with Unix (LF) line endings.
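The import script gives each run its own target directory by appending a timestamp from `date` to the table name. This is likely there because Sqoop normally refuses to import into a directory that already exists. The sketch below isolates that naming step so the resulting path can be inspected before running an import. The `target_dir` function is a hypothetical helper; the date format is the one used in Sqoop-Import-all.sh.

```shell
#!/bin/bash
# Build the target directory name the same way Sqoop-Import-all.sh does:
# <table>-<month>-<day>-<2-digit year>_<HH:MM:SS>
target_dir() {
  local bucket="$1" table="$2"
  echo "s3://${bucket}/${table}-$(date +"%m-%d-%y_%T")"
}

target_dir synerzip-imported-data User_Profile
```

Running it prints a path of the form `s3://synerzip-imported-data/User_Profile-MM-DD-YY_HH:MM:SS`, with the current date and time filled in.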