BLOGS BY TOPIC▼
BLOGS BY AUTHOR▼
BLOGS BY YEAR▼
Integration Services include a Slowly Changing Dimension transform, which is designed to make loading dimension data easier. This blog explains how it works, and discusses whether it achieves its objectives.
- Using the Slowly Changing Dimension Transform in SSIS
- Creating an SCD transform - type 0 (fixed attributes)
- Creating an SCD transform - type 1 (changing attributes)
- Creating an SCD transform - type 2 (historical attributes)
- Inferred Members in Slowly Changing Dimensions
- The disadvantages of SCD transforms (this blog)
Posted by Andy Brown on 05 July 2014
You need a minimum screen resolution of about 700 pixels width to see our blogs. This is because they contain diagrams and tables which would not be viewable easily on a mobile phone or small laptop. Please use a larger tablet, notebook or desktop computer, or change your screen resolution settings.
The disadvantages of SCD transforms
I'm not a fan of these transforms, but you may disagree. However, this being my blog allow me to put my side of the case first ...
Objection 1: lack of transparency
I love SSIS. I think it's excellent software, and what it does best of all is make complicated processes transparent. You could give a typical SSIS package like this to someone non-technical, and they'd be able to pick up what it's doing:

Documented packages like this are much easier to read than SQL.
Try giving the equivalent SQL statements to a non-technical person, and they won't have a clue what it's doing.
However, SCD transforms do the opposite to the rest of SSIS: they make a transparent process opaque:

Unlike nearly every other SSIS transform, you can't edit an SCD transform directly to see what it does.
I don't mind using wizards to save me time; I don't, however, like wizards which get me from A to B but don't reveal the path taken.
The other way in which SCD transforms add opacity to what's going on is by using jargon to complicate what is actually a simple underlying process, but this is more the fault of Analysis Services than SSIS.
Objection 2: speed
Here are some of the reasons that an SCD transform isn't going to be the quickest way to load dimension table data:
Reason | More details |
---|---|
Row by row lookups | For every row in the input table, a new lookup is done against the dimension table by the business key. This means that the dimension table is not cached in memory - this is not a good idea! |
Row by row updates | Whenever a change is found (for changing attributes), another update statement is run against the dimension table. Set-based update statements in SQL will run much more quickly. |
Slow inserts | When you're using historical attributes, the insert statements generated add rows one at a time, to avoid locks between inserts and updates. Again, this will not run quickly. |
You can overcome some of these problems by indexing the business key in the dimension table and by changing the insert and update tasks generated by the SCD transform wizard, but SQL is always going to load slowly changing dimension data more quickly than an SCD transform.
One sign that the SCD transform isn't the greatest thing since sliced bread: the fact that many websites recommend either third party add-ons or alternative ways to accomplish the same results.
My recommendation
If you know SQL, I think you'd be much better of writing a stored procedure to upload your SCD data, and then call this using an Execute SQL task within a package. For our historical attribute example, this could be something like this:
-- set the date we should record things changed on
DECLARE @ChangeDate date
SET @ChangeDate = GetDate()
-- update the existing people to say that
-- their grade changed on this given date
UPDATE
DimPerson
SET
ToDate = @ChangeDate
WHERE
(
SELECT COUNT(*)
FROM vwPromotions AS v
WHERE v.PersonId = DimPerson.PersonId
) > 0
-- add the new grades in
INSERT INTODimPerson(
PersonId,
FirstName,
Surname,
Grade,
FromDate,
ToDate
)
SELECT
v.PersonId,
v.FirstName,
v.Surname,
v.GradeName,
Null,
@ChangeDate
FROM
vwPromotions AS v
INNER JOIN DimPerson AS dp
ON v.PersonId = dp.PersonId
-- add all of the new people from the fact
-- "table" into the dimension table
INSERT INTODimPerson(
PersonId,
FirstName,
Surname,
Grade,
FromDate,
ToDate
)
SELECT
v.PersonId,
v.FirstName,
v.Surname,
v.GradeName,
Null,
v.PromotionDate
FROM
vwPromotions AS v
LEFT OUTER JOIN DimPerson AS dp
ON v.PersonId = dp.PersonId
WHERE
dp.PersonId is null
Note that I haven't tested this, nor have I attempted to explain it; I've included it just to show that I don't think an SCD transform does anything too scary.
Another good way to proceed would be to forget the SCD transform in SSIS, and instead do things the "hard" way by using a combination of lookup and other transforms.
- Using the Slowly Changing Dimension Transform in SSIS
- Creating an SCD transform - type 0 (fixed attributes)
- Creating an SCD transform - type 1 (changing attributes)
- Creating an SCD transform - type 2 (historical attributes)
- Inferred Members in Slowly Changing Dimensions
- The disadvantages of SCD transforms (this blog)