
Unravelled Development

SQL Server Integration Services - An Updated HDFS Task

This is an update to my previous post about loading data into HDFS. After using the component a few times I realised that having to pass in a list of files to upload seemed a bit odd inside a Data Flow task. So I have changed the component from a Destination into an SSIS Task, which means it is now used in the Control Flow of a package rather than in a Data Flow task.

Analysing Australian Parliamentarians' Expenditure with PowerQuery

Currently in Australia there have been news reports of Federal MPs abusing their expenses. So I thought I would take a look at what publicly available data there is and then use the Microsoft Excel Power* tools (PowerQuery, PowerPivot, PowerView) to do some analysis. This post is about using PowerQuery to pull data from the Department of Finance website into Excel and get it into shape to be used with PowerPivot.

SQLRally Nordic - What I'm Looking Forward To

This year I have been fortunate enough to have been able to attend a few different SQL Server community events. Back in May I attended SQLBits XI, which was a fantastic experience and opened my eyes to the #SQLFamily. I learnt a lot about SQL Server, but I think what I really learnt is that there is an amazing online community for SQL Server professionals. Then a couple of weeks ago I was at #sqlsat228 in Cambridge, where I attended a Friday pre-con on Big Data with Jen Stirrup and Allan Mitchell followed by a full day of sessions on the Saturday.

Load Files into HDFS using SQL Server Integration Services

UPDATE: I’ve made a few changes to how the component works - Read about it here. Recently I have been playing around with the Hortonworks Big Data sandbox; the tutorials were very good and made it easy to get going with some sample data. Given that I mainly use the Microsoft BI stack, I was hoping to be able to use SSIS to load my data, especially as it would be nice down the line to do this as part of an ETL process.

Microsoft Fakes and TeamCity (and XUnit)

This post is a note to my future self on how to configure a TeamCity build to run tests that use Microsoft Fakes. If you haven’t ever come across Microsoft Fakes then take a look at this post - Testing the untestable with Microsoft Fakes (http://msmvps.com/blogs/bsonnino/archive/2013/08/11/testing-the-untestable-with-microsoft-fakes.aspx) - for a good introduction. To set up the build agent, you will need to install Visual Studio 2012 and make sure that Update 3 is applied; no additional installation is required, as Microsoft Fakes comes bundled.

Getting Started with the Hortonworks Sandbox

In my previous post, I made reference to the Twitter Big Data example for Microsoft StreamInsight (project page). The sample collects tweets in real time from Twitter and then does a few things:

- Displays current information about trends in real time on a web dashboard
- Stores information about the tweets in a SQL Azure database
- Stores the actual tweets in an Azure Blob Store

There are then some commands you can use with Azure HDInsight to do some post-processing. This is great if you have access to the HDInsight Preview, but what if you are stuck on the waiting list?