562 attributed reviews in the last 3 years
Refreshingly small course sizes
Outstandingly good courseware
Whizzy online classrooms
Wise Owls only (no freelancers)
Almost no cancellations
We have genuine integrity
We invoice after training
Review 30+ years of Wise Owl
View our top 100 clients
What does the SCD transform in SSIS do, and should you use it?
Part one of a six-part series of blogs
Integration Services include a Slowly Changing Dimension transform, which is designed to make loading dimension data easier. This blog explains how it works, and discusses whether it achieves its objectives.
This blog shows how you can use the Slowly Changing Dimension transform in SSIS (hereafter mercifully abbreviated to SCD) to load data into a dimension.
If you don't use cubes or Analysis Services, you can almost certainly stop reading now and spare yourself a lot of pain!
One of my pet hates is training courses or blogs which use the AdventureWorks database as a source of example data, as I think examples should be easy to understand. With that in mind, let's create a sample relational database, and then a dimension which we want to load.
To use the data shown in this example, run the following script in SQL:
-- create a new database to hold tables
CREATE DATABASE WiseOwlScdBlog
-- use this database
-- create the table of possible grades
CREATE TABLE tblGrade(
GradeId int IDENTITY(1,1) PRIMARY KEY NOT NULL,
GradeName varchar(50) NOT NULL
-- create the table of people
CREATE TABLE tblPerson(
PersonId int IDENTITY(1,1) PRIMARY KEY NOT NULL,
FirstName nvarchar(255) NULL,
Surname nvarchar(255) NULL
-- create the table of links between them (who's on which grade)
CREATE TABLE tblGradePerson(
GradePersonId int IDENTITY(1,1) PRIMARY KEY NOT NULL,
GradeId int NOT NULL,
PersonId int NOT NULL,
PromotionDate date NOT NULL
-- allow overwriting of identity primary key values
SET IDENTITY_INSERT tblPerson ON
-- add 3 good and true people,and someone to manage them
INSERT tblPerson (PersonId, FirstName, Surname) VALUES (1, N'Rita', N'Smith')
INSERT tblPerson (PersonId, FirstName, Surname) VALUES (2, N'Bob', N'Brown')
INSERT tblPerson (PersonId, FirstName, Surname) VALUES (3, N'Sue', N'Jones')
INSERT tblPerson (PersonId, FirstName, Surname) VALUES (4, N'Wally', N'Owl')
SET IDENTITY_INSERT tblPerson OFF
SET IDENTITY_INSERT tblGrade ON
-- add grades minions can aspire to
INSERT tblGrade (GradeId, GradeName) VALUES (1, N'Office junior')
INSERT tblGrade (GradeId, GradeName) VALUES (2, N'Associate')
INSERT tblGrade (GradeId, GradeName) VALUES (3, N'Senior Associate')
INSERT tblGrade (GradeId, GradeName) VALUES (4, N'Manager')
INSERT tblGrade (GradeId, GradeName) VALUES (5, N'Director')
-- TAIAP ==> "To All Intents And Purposes"
INSERT tblGrade (GradeId, GradeName) VALUES (6, N'TAIAP God')
-- return to normal primary key behaviour
SET IDENTITY_INSERT tblGrade OFF
This should create 3 tables. The two populated ones will contain a list of 3 employees of a fictitious company, and a list of salary grades that they might aspire to:
|Our 4 people||Our 6 possible grades|
The third table contains a table linking these two together, which I'm now going to ask you to populate:
Right-click on the table called tblGradePerson table to edit its contents. Don't forget to refresh your databases and then tables first!
I'm using SQL Server 2012 for this blog, but everything will work just as well in SQL Server 2005, 2008, 2008 R2 or 2014!
Now just type in a few promotions:
Type in the four promotions above.
These represent the following promotions (clearly Rita is a bit of a star performer, and the company has a strange policy of promoting people on meaningful dates only):
|Rita||Associate||New Year's Day, 2014|
|Sue||Associate||New Year's Day, 2014|
|Bob||Associate||New Year's Day, 2014|
|Rita again||Senior Associate||April Fool's Day, 2014|
|Wally||Manager||Christmas Day, 2013|
We'll want to be able to combine the tables we've created, so run the following script to create a view:
-- create a view linking the 3 transactional tables
CREATE VIEW vwPromotions
tblPerson AS p
INNER JOIN tblGradePerson AS gp
ON p.PersonId = gp.PersonId
INNER JOIN tblGrade AS g
ON g.GradeId = gp.GradeId
-- test this view
SELECT * FROM vwPromotions
When you run this, it will create a view called vwPromotions, and display the results of running this on screen:
The 5 promotions which have taken place.
We now need a dimension to load, so run the following script:
-- create the dimension table
CREATE TABLE dbo.DimPerson(
SurrogateKeyId int IDENTITY(1,1) NOT NULL,
PersonId int NOT NULL,
FirstName nvarchar(255) NULL,
Surname nvarchar(255) NULL,
Grade varchar(50) NULL
-- add in some existing data
SET IDENTITY_INSERT dbo.DimPerson ON
INSERT dbo.DimPerson (SurrogateKeyId, PersonId, FirstName, Surname, Grade) VALUES (1, 1, N'Rita', N'Smith', N'Office junior')
INSERT dbo.DimPerson (SurrogateKeyId, PersonId, FirstName, Surname, Grade) VALUES (2, 2, N'Bob', N'Brown', N'Office junior')
INSERT dbo.DimPerson (SurrogateKeyId, PersonId, FirstName, Surname, Grade) VALUES (3, 3, N'Sue', N'Jones', N'Office junior')
SET IDENTITY_INSERT dbo.DimPerson OFF
Excellent! We now have a dimension table:
What we now want to do is to reload this table from our latest transactional data.
If you're worried about the fact that Wally Owl is nowhere to be seen, don't be - he'll be brought in as an inferred row.
Time to get to the point of this blog, and create our SCD transform!
|Parts of this blog|
25 Aytoun Street