Contents

SQL Server Integration Services - An Updated HDFS Task

Contents

This is an update to my previous post about loading data into HDFS. After using the component a few times I realized that having to pass in a list of files to upload seemed a bit odd inside of a Data Flow task.

So instead I have changed the component to be an SSIS Task, instead of a Destination. This means that it is used in the Control Flow of a package, instead of in a Data Flow task.

I have also made a few other changes:

  • Added a file type filter
  • Improved the UI design
  • Added the ability to create a new HDFS Connection Manager
  • Added a UI to the HDFS Connection Manager

This is what the component now looks like:

/sql-server-integration-services-an-updated-hdfs-task/images/hdfs-task-ui.png

The File Type Filter allows you to specify what types of files should be uploaded from the source directory specified. This is useful if you have a mixture of files or only want to upload a subset.

The update has been pushed to GitHub - https://github.com/kzhen/SSISHDFS.