How to loop over the worksheets in an Excel workbook using Integration Services
Follow this blog to learn how to use an SSIS package to loop over the worksheets in an Excel XLSX workbook, importing the contents of each.

Posted by Andy Brown on 21 April 2017

You need a minimum screen resolution of about 700 pixels width to see our blogs. This is because they contain diagrams and tables which would not be viewable easily on a mobile phone or small laptop. Please use a larger tablet, notebook or desktop computer, or change your screen resolution settings.

Use SSIS to import all of the worksheets from an Excel file

This blog shows how to loop over the worksheets in an Excel workbook, importing the contents of each.  The technique will only work if the worksheets are all in the same format.

You can follow this blog by downloading and unzipping this file.  This blog was written using SQL Server 2016 and Excel 2016 (for earlier versions of software see my previous blog).

Our example

We want to consolidate three separate months of shopping data into a single SQL Server table:

Three months of shopping data

Each month contains a separate list of shopping trips, but (crucially) they share the same format.

 

Here's what the end result should be:

The final table

What the final answer should look like (although the rows won't necessarily be in this order).

 

If you want to work through the case study in this blog, first run this script in Management Studio to create the table you'll need:

-- create table for shopping data

CREATE TABLE AdoShopping(

ShopName nvarchar(255),

ShopDate date,

Spending decimal(10,2)

)

GO

SELECT * FROM AdoShopping

That's enough about what we need to do.  Time now to do it!

The underlying principle

We need to get a list of all of the names of the worksheets in our workbook:

List of worksheets

We'll never see this, but this is the collection of names that we need to loop over.

  

We'll accomplish the above using a Foreach ADO.NET Schema Rowset Enumerator loop.  You'd normally use this to loop over tables, views or stored procedures in a SQL Server database, but it can also be used - as we'll see - to loop over worksheet names.

Step 1 - Create a variable to hold each worksheet's name

The first thing to do is to create a variable to hold each worksheet's name in turn:

The name of each worksheet

It's crucial that you give a value to this variable equal to the name of one of the worksheets that you want to import, otherwise you won't be able to get the data flow task working later.

Step 2 - Add the basic package tasks

Create a package which deletes any old rows from the AdoShopping table, then has an empty loop and an empty data flow task:

The skeleton package

The Execute SQL task will run the command TRUNCATE TABLE AdoShopping (if you don't know how to set this up, you're probably reading the wrong blog).  See below for how to configure the other tasks.

Step 3 - Creating a connection for the Foreach Loop

Double-click on the Foreach Loop task to edit it, and say what sort of loop it is:

Choose loop type

On the Collection tab, specify that this loop should be an ADO.NET Schema Rowset Enumerator (!).

Now choose to create a new connection:

Creating a connection

Choose to create a new connection

Ask to create a new connection:

Click on NEW button

Click to create a new connection.

 

Choose the provider that you want to use:

Access 12.0 provider

It may seem strange that you use an Access provider to connect to an Excel worksheet, but ...

 

Now type in the Excel workbook to which you want to link:

Excel workbook name

Make sure that you include the full file path and file name (obviously your path will be different to the one shown here).

 

On the All tab, set the Extended Properties to Excel 12.0;HDR-YES;:

Type extended properties

Type this in carefully!

HDR=YES tells SSIS that the first row of your Excel worksheet contains column headers, not data, but the more important information is the internal Excel version number.

Test your connection:

Testing connection

Click on this button at the bottom left to test your connection.

You should get this message:

Connection succeeded

Yeah!  Or if you get an error, check you haven't left the workbook open in Excel.

 

Select OK until you exit the dialog box for configuring your connection, then optionally rename the connection:

Renaming the connection

Rename the connection that you've created for your package.

Step 4 - Configuring the rest of the Foreach Loop

Double-click on your Foreach Loop to edit it, and specify that you want to return a list of the Excel tables:

Choose Excel tables

Tables may seem an odd choice, but this is internally how SQL will get at the worksheet names.

On the Variable Mappings tab, choose to store the second bit of information returned in your string variable:

Storing the worksheet name

Type in number 2 as the index number (for why, see hint below).

When you get a list of tables as above, you get the table catalog name, table schema name, table name and table type.  All but one of these bits of information is useless (and the first two don't even seem to be set).  You want to store the third bit of information, the table name, which is index number 2 (since nearly everything in SSIS is numbered from 0).

Step 5 - Configuring the data flow task

Here's what the data flow task will look like:

Data flow task

The details of how to export to SQL Server aren't shown here.  The Excel source should point to the Shopping.xlsx workbook.

 

For the Excel source, choose as the data access mode that the worksheet name is contained in a table variable, and specify which:

Excel source configuration

Each time round the loop, the Excel source will point to the worksheet whose name will be contained in the SheetName variable.

Here are the mappings for the SQL Server destination:

Mappings

The mappings from Excel source to SQL Server destination table AdoShopping.

 Step 6 - running the package

The final thing to do is to run the package:

Control flow running

What you should see in control flow

The data flow task will flash once for each worksheet:

Final data flow

At the end of the package, this is what the data flow should show (since there are 6 rows in the March worksheet, the last one to be imported).

 

And that's it!  Told you it was a messy procedure ... 

This blog has 0 threads Add post