What does the SCD transform in SSIS do, and should you use it?
Part one of a six-part series of blogs

Integration Services include a Slowly Changing Dimension transform, which is designed to make loading dimension data easier. This blog explains how it works, and discusses whether it achieves its objectives.

  1. Using the Slowly Changing Dimension Transform in SSIS (this blog)
  2. Creating an SCD transform - type 0 (fixed attributes)
  3. Creating an SCD transform - type 1 (changing attributes)
  4. Creating an SCD transform - type 2 (historical attributes)
  5. Inferred Members in Slowly Changing Dimensions
  6. The disadvantages of SCD transforms

Posted by Andy Brown on 05 July 2014

You need a minimum screen resolution of about 700 pixels width to see our blogs. This is because they contain diagrams and tables which would not be viewable easily on a mobile phone or small laptop. Please use a larger tablet, notebook or desktop computer, or change your screen resolution settings.

Using the Slowly Changing Dimension Transform in SSIS

This blog shows how you can use the Slowly Changing Dimension transform in SSIS (hereafter mercifully abbreviated to SCD) to load data into a dimension.

If you don't use cubes or Analysis Services, you can almost certainly stop reading now and spare yourself a lot of pain!

One of my pet hates is training courses or blogs which use the AdventureWorks database as a source of example data, as I think examples should be easy to understand.  With that in mind, let's create a sample relational database, and then a dimension which we want to load.

Our example relational database

To use the data shown in this example, run the following script in SQL:

-- create a new database to hold tables

CREATE DATABASE WiseOwlScdBlog

GO

-- use this database

USE WiseOwlScdBlog

GO

-- create the table of possible grades

CREATE TABLE tblGrade(

GradeId int IDENTITY(1,1) PRIMARY KEY NOT NULL,

GradeName varchar(50) NOT NULL

)

-- create the table of people

CREATE TABLE tblPerson(

PersonId int IDENTITY(1,1) PRIMARY KEY NOT NULL,

FirstName nvarchar(255) NULL,

Surname nvarchar(255) NULL

)

-- create the table of links between them (who's on which grade)

CREATE TABLE tblGradePerson(

GradePersonId int IDENTITY(1,1) PRIMARY KEY NOT NULL,

GradeId int NOT NULL,

PersonId int NOT NULL,

PromotionDate date NOT NULL

)

-- allow overwriting of identity primary key values

SET IDENTITY_INSERT tblPerson ON

-- add 3 good and true people,and someone to manage them

INSERT tblPerson (PersonId, FirstName, Surname) VALUES (1, N'Rita', N'Smith')

INSERT tblPerson (PersonId, FirstName, Surname) VALUES (2, N'Bob', N'Brown')

INSERT tblPerson (PersonId, FirstName, Surname) VALUES (3, N'Sue', N'Jones')

INSERT tblPerson (PersonId, FirstName, Surname) VALUES (4, N'Wally', N'Owl')

SET IDENTITY_INSERT tblPerson OFF

SET IDENTITY_INSERT tblGrade ON

-- add grades minions can aspire to

INSERT tblGrade (GradeId, GradeName) VALUES (1, N'Office junior')

INSERT tblGrade (GradeId, GradeName) VALUES (2, N'Associate')

INSERT tblGrade (GradeId, GradeName) VALUES (3, N'Senior Associate')

INSERT tblGrade (GradeId, GradeName) VALUES (4, N'Manager')

INSERT tblGrade (GradeId, GradeName) VALUES (5, N'Director')

-- TAIAP ==> "To All Intents And Purposes"

INSERT tblGrade (GradeId, GradeName) VALUES (6, N'TAIAP God')

-- return to normal primary key behaviour

SET IDENTITY_INSERT tblGrade OFF

GO

This should create 3 tables.  The two populated ones will contain a list of 3 employees of a fictitious company, and a list of salary grades that they might aspire to:

Table of people Table of grades
Our 4 people Our 6 possible grades

The third table contains a table linking these two together, which I'm now going to ask you to populate:

Editing the tblGradePerson table

Right-click on the table called tblGradePerson table to edit its contents.  Don't forget to refresh your databases and then tables first!

 

I'm using SQL Server 2012 for this blog, but everything will work just as well in SQL Server 2005, 2008, 2008 R2 or 2014! 

Now just type in a few promotions:

GradePerson table

Type in the four promotions above.

These represent the following promotions (clearly Rita is a bit of a star performer, and the company has a strange policy of promoting people on meaningful dates only):

Person Grade Date
Rita Associate New Year's Day, 2014
Sue Associate New Year's Day, 2014
Bob Associate New Year's Day, 2014
Rita again Senior Associate April Fool's Day, 2014
Wally Manager Christmas Day, 2013

Creating a view to combine tables

We'll want to be able to combine the tables we've created, so run the following script to create a view:

USE [WiseOwlScdBlog]

GO

-- create a view linking the 3 transactional tables

CREATE VIEW vwPromotions

AS

SELECT

p.PersonId,

p.FirstName,

p.Surname,

g.GradeName,

gp.PromotionDate

FROM

tblPerson AS p

INNER JOIN tblGradePerson AS gp

ON p.PersonId = gp.PersonId

INNER JOIN tblGrade AS g

ON g.GradeId = gp.GradeId

GO

-- test this view

SELECT * FROM vwPromotions

When you run this, it will create a view called vwPromotions, and display the results of running this on screen:

The promotions view

The 5 promotions which have taken place.

Our example dimension

We now need a dimension to load, so run the following script:

USE WiseOwlScdBlog

GO

-- create the dimension table

CREATE TABLE dbo.DimPerson(

SurrogateKeyId int IDENTITY(1,1) NOT NULL,

PersonId int NOT NULL,

FirstName nvarchar(255) NULL,

Surname nvarchar(255) NULL,

Grade varchar(50) NULL

)

-- add in some existing data

SET IDENTITY_INSERT dbo.DimPerson ON

INSERT dbo.DimPerson (SurrogateKeyId, PersonId, FirstName, Surname, Grade) VALUES (1, 1, N'Rita', N'Smith', N'Office junior')

INSERT dbo.DimPerson (SurrogateKeyId, PersonId, FirstName, Surname, Grade) VALUES (2, 2, N'Bob', N'Brown', N'Office junior')

INSERT dbo.DimPerson (SurrogateKeyId, PersonId, FirstName, Surname, Grade) VALUES (3, 3, N'Sue', N'Jones', N'Office junior')

SET IDENTITY_INSERT dbo.DimPerson OFF

Excellent!  We now have a dimension table:

Dimension table

What we now want to do is to reload this table from our latest transactional data.

If you're worried about the fact that Wally Owl is nowhere to be seen, don't be - he'll be brought in as an inferred row.

 

Time to get to the point of this blog, and create our SCD transform!

This blog has 0 threads Add post