WISE OWL EXERCISES
SSIS INTEGRATION SERVICES EXERCISES
SSIS Integration Services | Basic data transforms exercise | Create a random playlist from an Excel workbook of songs
This exercise is provided to allow potential course delegates to choose the correct Wise Owl Microsoft training course, and may not be reproduced in whole or in part in any format without the prior written consent of Wise Owl.
The aim of this exercise is to create a playlist of about 5 songs, although by a fairly roundabout route! First create a package called Playlist, and within this a data flow task which includes the following:
|What|What it does|
|---|---|
|Excel source|Import the 123 songs from the Excel workbook in the folder above.|
|Multicast transform|Duplicate this list of songs.|
|Row sampling transform|Sample 3 rows from the left-hand list.|
|Percentage sampling transform|Sample 1% of the rows from the right-hand list.|
|Union all transform|Combine the two paths into one.|
|Sort transform|Sort the songs so the one with the highest sales comes first.|
|Union all transform|Add this, and put a data viewer on the data pipe leading to it so you can see which songs you've chosen!|
All of this is much easier to understand with a flow chart - for which read on!
Here's what your final package could look like, if you didn't give any of the transforms nice names:
A possible final flow chart for your package.
Here's what your playlist should look like (your songs will obviously be different):
A mix of styles here ...
Generate a couple of playlists, then close your package down.
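For readers who want to see the logic of the data flow outside SSIS, here is a minimal Python sketch of the steps in the table above. It is only an illustration: the `build_playlist` function, the `title` and `sales` field names and the randomly generated song list are all assumptions standing in for the real Excel workbook, and the per-row behaviour of the percentage sample is a simplification.

```python
import random

def build_playlist(songs, rng, sample_rows=3, sample_fraction=0.01):
    """Mimic the data flow: multicast, two samples, union all, sort."""
    # Multicast: both branches see the same full list of songs.
    left, right = list(songs), list(songs)

    # Row sampling: pick a fixed number of rows from the left-hand copy.
    row_sample = rng.sample(left, min(sample_rows, len(left)))

    # Percentage sampling (simplified): keep each right-hand row with
    # probability sample_fraction, so the number kept varies per run.
    pct_sample = [song for song in right if rng.random() < sample_fraction]

    # Union all: concatenate the two paths (duplicates are not removed).
    combined = row_sample + pct_sample

    # Sort: highest sales first.
    return sorted(combined, key=lambda song: song["sales"], reverse=True)


# Hypothetical stand-in for the 123 songs in the Excel workbook.
data_rng = random.Random(0)
songs = [{"title": f"Song {i}", "sales": data_rng.randint(1_000, 1_000_000)}
         for i in range(1, 124)]

playlist = build_playlist(songs, random.Random())
for song in playlist:
    print(f'{song["sales"]:>9,}  {song["title"]}')
```

Running the script a few times should give playlists of slightly different lengths, because only the row sample has a fixed size.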
I don't understand why, when I run my package, I get a different number of songs each time (sometimes 3, other times 5 or 6). I do see that sometimes the percentage sampling does not output any rows. Can anyone give me a bit more clarity, please? Thank you.
The sampling transforms are non-blocking: that is, for each row they take as input they decide whether to accept or reject it, based on a probability. So if (for example) you say you want to take 10% of rows, then each row has a 1-in-10 chance of being accepted.
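To make this concrete, the following Python sketch simulates a row-by-row sample of the kind described above, using the exercise's figures of 123 songs and a 1% sample. The `percentage_sample` function is a hypothetical stand-in, not the SSIS implementation. Under that assumption, the chance that every one of the 123 rows is rejected is 0.99 to the power 123, which is about 0.29, so seeing no rows from the percentage sample in roughly three runs out of ten would be expected.

```python
import random
from collections import Counter

def percentage_sample(rows, fraction, rng):
    """Keep each row independently with the given probability."""
    return [row for row in rows if rng.random() < fraction]

rows = range(123)      # 123 songs, as in the exercise
fraction = 0.01        # 1% sample
rng = random.Random()

# Run the sample many times and tally how many rows came out each time.
runs = 10_000
counts = Counter(len(percentage_sample(rows, fraction, rng)) for _ in range(runs))
for n_rows in sorted(counts):
    print(f"{n_rows} rows: {counts[n_rows] / runs:.1%} of runs")

# Chance that no row is selected at all: every row must be rejected.
p_zero = (1 - fraction) ** len(rows)
print(f"Theoretical chance of zero rows: {p_zero:.1%}")
```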
The consequence of this is that each time you run the task you can end up with a different number of rows output from it. The alternative would be for the transform to be fully blocking: that is, it would have to read in all of the rows and then select exactly the proportion you wanted. This would be more predictable, but would run much more slowly.
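As a rough illustration of the blocking alternative, here is a Python sketch that buffers every row before choosing an exact share of them. The `exact_percentage_sample` function and the rounding rule it uses are assumptions made for the example, not a description of how any SSIS transform works.

```python
import random

def exact_percentage_sample(rows, fraction, rng):
    """Blocking sample: buffer all rows, then pick an exact proportion."""
    buffered = list(rows)                      # must consume the whole input first
    target = round(len(buffered) * fraction)   # exact number of rows to return
    return rng.sample(buffered, target)

rng = random.Random()

# Every run returns the same number of rows (12 for 10% of 123 rows).
sizes = {len(exact_percentage_sample(range(123), 0.10, rng)) for _ in range(1_000)}
print(sizes)
```

The output size no longer varies between runs, but no row can be passed downstream until the whole input has been read, which is the cost mentioned above.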