560 attributed reviews in the last 3 years
Refreshingly small course sizes
Outstandingly good courseware
Whizzy online classrooms
Wise Owls only (no freelancers)
Almost no cancellations
We have genuine integrity
We invoice after training
Review 30+ years of Wise Owl
View our top 100 clients
What does the SCD transform in SSIS do, and should you use it?
Part two of a six-part series of blogs
Integration Services include a Slowly Changing Dimension transform, which is designed to make loading dimension data easier. This blog explains how it works, and discusses whether it achieves its objectives.
This part of the blog shows how to create a type 0 slowly changing dimension transform, in which dimension data is not updated but new records can be loaded.
This blog assumes that you're comfortable creating packages in SSIS.
The result should look something like this:
The task takes its data from the view we created earlier.
Here's the relevant bit of the dialog box for creating the task, if you're feeling unsure:
Setting the OLEDB connection and view.
Now add an SCD transform:
I'm using SSIS 2012, but I believe that this would work the same way in 2008 R2 and even 2005.
Connect the tasks up, then edit the SCD task:
Connect the two tasks.
You might be tempted to tick the box below:
Tick the Don't show this page again box to avoid seeing this first step any more times, then click on the button shown to go to the next stage of the wizard.
The first thing to do is to say which dimension we're trying to load:
We're going to load the DimPerson table.
Because I haven't been consistent with my column names, I'll now need to change one of the input columns:
The Grade column is called GradeName in the source table.
You must now set one business key:
Tell SSIS which is the column to use to match rows.
Note that the PersonId column can not be a primary key in the dimension table, since records for a person could potentially appear several times. The safest and easiest thing to do is to create a separate identity column in the dimension table DimPerson, as we've done (it was called SurrogateKeyId). This is called a "surrogate key" (hence the column name!).
On the next screen of the wizard, specify that the column containing values which may be different is the Grade one:
We're not interested in changes to the first name or surname - just the grade. We'll examine changing and historical attributes later in this tutorial.
It's probably an idea to untick the box on the next stage of the wizard:
If you leave this ticked (as it is by default), the moment you find a changed grade for an employee the transform will fail.
Untick the inferred members box in the next stage of the wizard (I'll cover this separately later in this blog):
For the moment, we won't enable inferred member support - or even wonder what it is!
That's it! After confirming your choices on the final step of the wizard, you can now see your results:
The wizard has created one additional task.
Personally I don't like the fact that the wizard is such an opaque task: you can't look at it to see what it does, and can only edit it by going back into the wizard. More on this towards the end of this blog.
If you run the package you've created, it will look like this:
5 rows come into the SCD transform from the transactional database view; one row goes into the DimPerson dimension table, representing Wally Owl, since no match was found for person number 4.
Note that if we'd left the fail this transformation box ticked, the SCD task would have failed on the first record.
If you can't see the point of a transform which doesn't actually change any data, you're not alone! Let's move on to the rather more useful type 1 SCD, when we actually update the dimension table data.
Why do SCDs sound so like diabetes (type 1 and type 2 being the main ones used)? And why does SCD sound so unpleasantly similar to STD? Perhaps it's just me ...
|Parts of this blog|
25 Aytoun Street