SQL Server Integration Services - An Updated HDFS Task
This is an update to my previous post about loading data into HDFS. After using the component a few times, I realized that having to pass in a list of files to upload felt odd inside a Data Flow task.
So I have changed the component from a Destination to an SSIS Task. This means that it is now used in the Control Flow of a package rather than inside a Data Flow task.
I have also made a few other changes:
- Added a file type filter
- Improved the UI design
- Added the ability to create a new HDFS Connection Manager
- Added a UI to the HDFS Connection Manager
This is what the component now looks like:
The File Type Filter allows you to specify which types of files should be uploaded from the specified source directory. This is useful if the directory contains a mixture of files and you only want to upload a subset of them.
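To illustrate the idea, here is a minimal Python sketch of this kind of filtering. It is not the component's own code (the task itself is an SSIS component); the function name select_files, the wildcard pattern syntax such as "*.csv", and the case-insensitive matching are all assumptions made for the example.

```python
from fnmatch import fnmatchcase
from pathlib import Path


def select_files(source_dir, file_filter="*.*"):
    """Return the sorted names of files directly inside source_dir that match file_filter.

    Hypothetical helper: the wildcard syntax and the case-insensitive
    comparison are assumptions, not the behaviour of the SSIS task.
    """
    pattern = file_filter.lower()
    return sorted(
        entry.name
        for entry in Path(source_dir).iterdir()
        # Skip sub-directories; only plain files are upload candidates.
        if entry.is_file() and fnmatchcase(entry.name.lower(), pattern)
    )
```

With a source directory containing a.csv, b.CSV and c.txt, calling select_files(source_dir, "*.csv") would return only the two CSV files, so only that subset would be passed on for upload.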
The update has been pushed to GitHub: https://github.com/kzhen/SSISHDFS