What does the SCD transform in SSIS do, and should you use it?
Part two of a six-part series of blogs

Integration Services include a Slowly Changing Dimension transform, which is designed to make loading dimension data easier. This blog explains how it works, and discusses whether it achieves its objectives.

  1. Using the Slowly Changing Dimension Transform in SSIS
  2. Creating an SCD transform - type 0 (fixed attributes) (this blog)
  3. Creating an SCD transform - type 1 (changing attributes)
  4. Creating an SCD transform - type 2 (historical attributes)
  5. Inferred Members in Slowly Changing Dimensions
  6. The disadvantages of SCD transforms

Posted by Andy Brown on 05 July 2014

You need a minimum screen resolution of about 700 pixels width to see our blogs. This is because they contain diagrams and tables which would not be viewable easily on a mobile phone or small laptop. Please use a larger tablet, notebook or desktop computer, or change your screen resolution settings.

Creating an SCD transform - type 0 (fixed attributes)

This part of the blog shows how to create a type 0 slowly changing dimension transform, in which dimension data is not updated but new records can be loaded.

This blog assumes that you're comfortable creating packages in SSIS.

Creating the SCD transform

To start:

  1. create a new package;
  2. within this create a data flow task; and
  3. within this create an OLEDB source to take data from the view we've created.

The result should look something like this:

The OLEDB source

The task takes its data from the view we created earlier.

 

Here's the relevant bit of the dialog box for creating the task, if you're feeling unsure:

Setting the OLEDB connection and view

Setting the OLEDB connection and view.

Now add an SCD transform:

Adding an SCD transform

I'm using SSIS 2012, but I believe that this would work the same way in 2008 R2 and even 2005.

 

Connect the tasks up, then edit the SCD task:

Connecting tasks

Connect the two tasks.

 

You might be tempted to tick the box below:

The SCD wizard

Tick the Don't show this page again box to avoid seeing this first step any more times, then click on the button shown to go to the next stage of the wizard.

 

Configuring the SCD transform 

The first thing to do is to say which dimension we're trying to load:

Choose the SCD table

We're going to load the DimPerson table.

 

Because I haven't been consistent with my column names, I'll now need to change one of the input columns:

Choose an input column

The Grade column is called GradeName in the source table.

 

You must now set one business key:

Setting a business key

Tell SSIS which is the column to use to match rows.

Note that the PersonId column can not be a primary key in the dimension table, since records for a person could potentially appear several times.  The safest and easiest thing to do is to create a separate identity column in the dimension table DimPerson, as we've done (it was called SurrogateKeyId).  This is called a "surrogate key" (hence the column name!).

On the next screen of the wizard, specify that the column containing values which may be different is the Grade one:

Setting the dimension column

We're not interested in changes to the first name or surname - just the grade.  We'll examine changing and historical attributes later in this tutorial.

It's probably an idea to untick the box on the next stage of the wizard:

Don't fail the transformation

If you leave this ticked (as it is by default), the moment you find a changed grade for an employee the transform will fail.

Untick the inferred members box in the next stage of the wizard (I'll cover this separately later in this blog):

Unticking enabled inferred member support

For the moment, we won't enable inferred member support - or even wonder what it is!

 

That's it!  After confirming your choices on the final step of the wizard, you can now see your results:

The results of the wizard

The wizard has created one additional task.

 

Personally I don't like the fact that the wizard is such an opaque task: you can't look at it to see what it does, and can only edit it by going back into the wizard.  More on this towards the end of this blog.

Running the SCD transform 

If you run the package you've created, it will look like this:

The package when run

5 rows come into the SCD transform from the transactional database view; one row goes into the DimPerson dimension table, representing Wally Owl, since no match was found for person number 4.

 

Note that if we'd left the fail this transformation box ticked,  the SCD task would have failed on the first record.

Conclusion 

If you can't see the point of a transform which doesn't actually change any data, you're not alone!  Let's move on to the rather more useful type 1 SCD, when we actually update the dimension table data.

Why do SCDs sound so like diabetes (type 1 and type 2 being the main ones used)?  And why does SCD sound so unpleasantly similar to STD?  Perhaps it's just me ...

This blog has 0 threads Add post