SSIS Integration Services | Basic data transforms exercise | Create a random playlist from an Excel workbook of songs

This exercise is provided to allow potential course delegates to choose the correct Wise Owl Microsoft training course, and may not be reproduced in whole or in part in any format without the prior written consent of Wise Owl.

You can learn how to do this exercise if you attend the course listed below!

Software ==> SSIS Integration Services  (46 exercises)
Version ==> SSIS 2012 and later
Topic ==> Basic data transforms  (1 exercise)
Level ==> Average difficulty
Course ==> Introduction to SSIS
Before you can do this exercise, you'll need to download and unzip the accompanying file.

The aim of this exercise is to create a playlist of about 5 songs, although by a fairly roundabout route!  First create a package called Playlist, and within this a data flow task which includes the following:

What | What it does
Excel source | Import the 123 songs from the Excel workbook in the folder above.
Multicast transform | Duplicate this list of songs.
Row sampling transform | Sample 3 rows from the left-hand list.
Percentage sampling transform | Sample 1% of the rows from the right-hand list.
Union all transform | Combine the two paths into one.
Sort transform | Sort the songs so the one with the highest sales comes first.
Union all transform | Add this, and put a data viewer on the data pipe leading to it so you can see which songs you've chosen!
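
For readers who find code easier to follow than a table, the same data flow can be sketched outside SSIS. This is only a rough Python analogy, not how SSIS implements the transforms: the song list, the `Sales` field and the function name `build_playlist` are made up for illustration, and the per-row 1% test is an assumption about how percentage sampling behaves.

```python
import random

def build_playlist(songs, fixed_rows=3, percent=1.0, rng=None):
    """Mimic the data flow: multicast, two samples, union all, sort."""
    rng = rng or random.Random()
    # Multicast: both paths see the same full list of songs.
    left, right = list(songs), list(songs)
    # Row sampling: exactly fixed_rows rows (fewer only if the input is smaller).
    row_sample = rng.sample(left, min(fixed_rows, len(left)))
    # Percentage sampling (assumed): keep each row independently with the given chance.
    pct_sample = [s for s in right if rng.random() < percent / 100]
    # Union all: concatenate the two paths without removing duplicates.
    combined = row_sample + pct_sample
    # Sort: highest sales first.
    return sorted(combined, key=lambda s: s["Sales"], reverse=True)

# Hypothetical stand-in for the 123 songs in the workbook.
songs = [{"Title": f"Song {i}", "Sales": i * 1000} for i in range(1, 124)]
playlist = build_playlist(songs, rng=random.Random(42))
for song in playlist:
    print(song["Title"], song["Sales"])
```

Because the union keeps duplicates, a song picked by both samples would appear twice in this sketch.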

All of this is much easier to understand with a flow chart - for which read on!

Here's what your final package could look like, if you didn't give any of the transforms nice names:

[Image: flow chart. A possible final flow chart for your package.]

Here's what your playlist should look like (your songs will obviously be different):

[Image: playlist. A mix of styles here ...]

Generate a couple of playlists, then close your package down.

You can unzip this file to see the answers to this exercise, although please remember this is for your personal use only.
This page has 1 thread
11 Apr 20 at 22:47

Hello, 

I don't understand why, when I run my package, I get a different number of songs each time (sometimes 3, other times 5 or 6). I do see that sometimes the percentage sampling does not output any rows. Can anyone give me a bit more clarity, please? Thank you.

13 Apr 20 at 09:53

Good question.  

The percentage sampling transform is non-blocking: that is, for each row it takes as input it decides whether to accept or reject it, assigning a probability to each path.  So if (for example) you say you want to take 10% of rows, then for each row there is a 1-in-10 chance it will be accepted.  (The row sampling transform, by contrast, always returns exactly the number of rows you ask for.)

The consequence of this is that each time you run the task you can end up with a different number of rows output from it.  The alternative would be for the transform to be made fully blocking: that is, it would have to read in all of the rows, then ensure that it selected exactly the proportion you wanted.  This would be more predictable, but would run much more slowly.
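
A quick simulation may make this concrete. The sketch below is not SSIS code: it assumes each of the 123 rows is kept independently with a 1% chance (the function name and the number of runs are made up for illustration), and it counts how many rows a percentage sample returns over many runs.

```python
import random
from collections import Counter

def percentage_sample_size(total_rows=123, percent=1.0, rng=random):
    """Count rows kept when each row is accepted independently (assumed model)."""
    return sum(1 for _ in range(total_rows) if rng.random() < percent / 100)

rng = random.Random(0)
runs = 10_000
sizes = Counter(percentage_sample_size(rng=rng) for _ in range(runs))
for size in sorted(sizes):
    print(f"{size} rows: {sizes[size] / runs:.1%} of runs")

# Under the same assumption, the chance of getting no rows at all:
p_empty = (1 - 0.01) ** 123
print(f"Chance of an empty sample: {p_empty:.1%}")
```

Under that assumption an empty sample turns up in a little under 3 runs in 10, which would explain a playlist containing only the 3 songs from the row sampling transform.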

13 Apr 20 at 20:40

Oh I see, thank you for clarifying.