Merge data sets imported via incremental import using Sqoop

Sqoop incremental import comes into the picture because of a pattern called CDC, i.e. Change Data Capture. So what is CDC?

CDC is a design pattern that captures individual data changes instead of processing the entire data set. Rather than dumping the whole database every time, with CDC we capture only the changes made to the master database.
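The idea can be sketched in a few lines of Python. This is only an illustration of the pattern, not Sqoop's implementation: the function name `capture_changes`, the `id` check column, and the sample rows are all assumptions made up for the example. It keeps a "last value" high-water mark and returns only the rows above it.

```python
from typing import Dict, List, Tuple

Row = Dict[str, object]


def capture_changes(
    rows: List[Row], check_column: str, last_value: int
) -> Tuple[List[Row], int]:
    """Return only the rows added since last_value, plus the new high-water mark.

    Hypothetical helper: assumes check_column is a monotonically
    increasing integer such as an auto-increment primary key.
    """
    delta = [r for r in rows if r[check_column] > last_value]
    # If nothing changed, the high-water mark stays where it was.
    new_last = max((r[check_column] for r in delta), default=last_value)
    return delta, new_last


# Illustrative master table (made-up data).
master = [
    {"id": 1, "name": "a"},
    {"id": 2, "name": "b"},
    {"id": 3, "name": "c"},
]

# A previous run already captured everything up to id 1.
delta, last_value = capture_changes(master, "id", 1)
print(delta)
print(last_value)
```

Storing the returned high-water mark between runs is what lets the next run skip everything that was already captured.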

For example, suppose 1 lakh (100,000) new entries arrive in the RDBMS every day and must be made available in Hadoop on a daily basis. We would want to fetch only the newly added data, because importing the complete RDBMS data into Hadoop every day adds overhead and also delays the availability of the data. For a detailed explanation go through this link.
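Once a daily delta has been fetched, it still has to be combined with the data already in Hadoop. The following Python sketch shows one way to think about that merge step, assuming each row carries a unique key and that a row in the newer data set should replace an older row with the same key. The function name `merge_on_key` and the sample data are hypothetical; this models the concept only and does not call Sqoop.

```python
from typing import Dict, List

Row = Dict[str, object]


def merge_on_key(existing: List[Row], new_data: List[Row], merge_key: str) -> List[Row]:
    """Combine two data sets, keeping the newest row for each key.

    Hypothetical helper: rows in new_data override rows in existing
    that share the same merge_key value.
    """
    merged = {row[merge_key]: row for row in existing}
    for row in new_data:
        merged[row[merge_key]] = row  # newer record wins
    return sorted(merged.values(), key=lambda r: r[merge_key])


# Data already in Hadoop from earlier imports (made-up rows).
existing = [{"id": 1, "name": "a"}, {"id": 2, "name": "b"}]

# Today's delta: one updated row and one brand-new row.
todays_delta = [{"id": 2, "name": "b-updated"}, {"id": 3, "name": "c"}]

result = merge_on_key(existing, todays_delta, "id")
print(result)
```

Only the delta crosses the network each day, while the merged result still reflects the full, current state of the source table.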
