Sqoop Import


Syntax

  • <rdbms-jdbc-url> // RDBMS JDBC URL
  • <username> // Username of the RDBMS database
  • <password> // Password of the RDBMS database
  • <table-name> // RDBMS database table
  • <hdfs-home-dir> // HDFS home directory
  • <condition> // Condition that can be expressed in the form of a SQL query with a WHERE clause.
  • <sql-query> // SQL Query
  • <target-dir> // HDFS Target Directory
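
As a rough sketch of how the placeholders above might fit into a Sqoop1 `sqoop import` command line, the snippet below assembles two invocations and prints them instead of running them, so it works without a Sqoop installation. Every concrete value (host `dbhost`, database `sales`, user `etl_user`, table `orders`, the `status` column, the target path) and both helper function names are hypothetical and chosen only for illustration.

```shell
# Hypothetical values standing in for the placeholders listed above.
JDBC_URL='jdbc:mysql://dbhost:3306/sales'   # <rdbms-jdbc-url>
DB_USER='etl_user'                          # <username>
DB_PASS='changeme'                          # <password> (-P prompts on the console instead)
TABLE='orders'                              # <table-name>
CONDITION="status = 'SHIPPED'"              # <condition>
TARGET_DIR='/user/etl_user/orders'          # <target-dir>

# Print (do not run) a table import restricted by a WHERE condition.
table_import_cmd() {
  printf 'sqoop import --connect "%s" --username "%s" --password "%s" --table "%s" --where "%s" --target-dir "%s"\n' \
    "$JDBC_URL" "$DB_USER" "$DB_PASS" "$TABLE" "$CONDITION" "$TARGET_DIR"
}

# Print (do not run) a free-form query import (<sql-query>).
# Sqoop expects the literal $CONDITIONS token in the WHERE clause,
# and -m 1 (a single mapper) avoids having to supply --split-by.
query_import_cmd() {
  query="SELECT * FROM $TABLE WHERE \$CONDITIONS"
  printf "sqoop import --connect '%s' --username '%s' --password '%s' --query '%s' --target-dir '%s' -m 1\n" \
    "$JDBC_URL" "$DB_USER" "$DB_PASS" "$query" "$TARGET_DIR"
}

table_import_cmd
query_import_cmd
```

The printed lines can be inspected and, once the values match a real database, pasted into a shell on a machine where Sqoop is installed. The query variant is printed with single quotes so that the shell does not expand `$CONDITIONS` before Sqoop sees it.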

Remarks

Sqoop is a Hadoop command-line tool that imports tables from an RDBMS data source into HDFS and exports data from HDFS back to an RDBMS. During an import it generates a Java class that lets us interact with the imported data. Each row of a table is saved as a separate record in HDFS. Records can be stored as text files or in binary form as Avro data files or SequenceFiles.

There are two versions of Sqoop: Sqoop1 and Sqoop2.

Sqoop1 is the widely accepted tool and is recommended for production environments. Cloudera's website provides a comparison between Sqoop1 and Sqoop2.
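
The storage formats mentioned in these remarks are selected in Sqoop1 with the `--as-textfile`, `--as-avrodatafile`, and `--as-sequencefile` import options. The following is a small sketch, assuming a hypothetical helper named `format_flag` and made-up connection details, that maps a format name to its flag and prints the resulting command without executing it.

```shell
# Map a storage format name to the Sqoop1 import flag that selects it.
# format_flag is a hypothetical helper, not part of Sqoop.
format_flag() {
  case "$1" in
    text)     echo '--as-textfile' ;;       # delimited text, the default
    avro)     echo '--as-avrodatafile' ;;   # binary Avro data files
    sequence) echo '--as-sequencefile' ;;   # binary SequenceFiles
    *)        echo "unknown format: $1" >&2; return 1 ;;
  esac
}

# Hypothetical JDBC URL and table; the command is printed, not run.
printf 'sqoop import --connect "%s" --table "%s" %s\n' \
  'jdbc:mysql://dbhost:3306/sales' 'orders' "$(format_flag avro)"
```

Text output is easiest to inspect by hand, while the binary formats are generally the better fit when the imported records feed further Hadoop jobs.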