Read our blogs, tips and tutorials
Try our exercises or test your skills
Watch our tutorial videos or shorts
Take a self-paced course
Read our recent newsletters
License our courseware
Book expert consultancy
Buy our publications
Get help in using our site
561 attributed reviews in the last 3 years
Refreshingly small course sizes
Outstandingly good courseware
Whizzy online classrooms
Wise Owl trainers only (no freelancers)
Almost no cancellations
We have genuine integrity
We invoice after training
Review 30+ years of Wise Owl
View our top 100 clients
Search our website
We also send out useful tips in a monthly email newsletter ...
How to loop over the worksheets in an Excel workbook using Integration Services |
---|
Follow this blog to learn how to use an SSIS package to loop over the worksheets in an Excel XLSX workbook, importing the contents of each. |
This blog shows how to loop over the worksheets in an Excel workbook, importing the contents of each. The technique will only work if the worksheets are all in the same format.
You can follow this blog by downloading and unzipping this file. This blog was written using SQL Server 2016 and Excel 2016 (for earlier versions of software see my previous blog; or have a look at our Excel training options).
We want to consolidate three separate months of shopping data into a single SQL Server table:
Each month contains a separate list of shopping trips, but (crucially) they share the same format.
Here's what the end result should be:
What the final answer should look like (although the rows won't necessarily be in this order).
If you want to work through the case study in this blog, first run this script in Management Studio to create the table you'll need:
-- create table for shopping data
CREATE TABLE AdoShopping(
ShopName nvarchar(255),
ShopDate date,
Spending decimal(10,2)
)
GO
SELECT * FROM AdoShopping
That's enough about what we need to do. Time now to do it!
We need to get a list of all of the names of the worksheets in our workbook:
We'll never see this, but this is the collection of names that we need to loop over.
We'll accomplish the above using a Foreach ADO.NET Schema Rowset Enumerator loop. You'd normally use this to loop over tables, views or stored procedures in a SQL Server database, but it can also be used - as we'll see - to loop over worksheet names.
The first thing to do is to create a variable to hold each worksheet's name in turn:
It's crucial that you give a value to this variable equal to the name of one of the worksheets that you want to import, otherwise you won't be able to get the data flow task working later.
Create a package which deletes any old rows from the AdoShopping table, then has an empty loop and an empty data flow task:
The Execute SQL task will run the command TRUNCATE TABLE AdoShopping (if you don't know how to set this up, you're probably reading the wrong blog). See below for how to configure the other tasks.
Double-click on the Foreach Loop task to edit it, and say what sort of loop it is:
On the Collection tab, specify that this loop should be an ADO.NET Schema Rowset Enumerator (!).
Now choose to create a new connection:
Choose to create a new connection
Ask to create a new connection:
Click to create a new connection.
Choose the provider that you want to use:
It may seem strange that you use an Access provider to connect to an Excel worksheet, but ...
Now type in the Excel workbook to which you want to link:
Make sure that you include the full file path and file name (obviously your path will be different to the one shown here).
On the All tab, set the Extended Properties to Excel 12.0;HDR-YES;:
Type this in carefully!
HDR=YES tells SSIS that the first row of your Excel worksheet contains column headers, not data, but the more important information is the internal Excel version number.
Test your connection:
Click on this button at the bottom left to test your connection.
You should get this message:
Yeah! Or if you get an error, check you haven't left the workbook open in Excel.
Select OK until you exit the dialog box for configuring your connection, then optionally rename the connection:
Rename the connection that you've created for your package.
Double-click on your Foreach Loop to edit it, and specify that you want to return a list of the Excel tables:
Tables may seem an odd choice, but this is internally how SQL will get at the worksheet names.
On the Variable Mappings tab, choose to store the second bit of information returned in your string variable:
Type in number 2 as the index number (for why, see hint below).
When you get a list of tables as above, you get the table catalog name, table schema name, table name and table type. All but one of these bits of information is useless (and the first two don't even seem to be set). You want to store the third bit of information, the table name, which is index number 2 (since nearly everything in SSIS is numbered from 0).
Here's what the data flow task will look like:
The details of how to export to SQL Server aren't shown here. The Excel source should point to the Shopping.xlsx workbook.
For the Excel source, choose as the data access mode that the worksheet name is contained in a table variable, and specify which:
Each time round the loop, the Excel source will point to the worksheet whose name will be contained in the SheetName variable.
Here are the mappings for the SQL Server destination:
The mappings from Excel source to SQL Server destination table AdoShopping.
The final thing to do is to run the package:
What you should see in control flow
The data flow task will flash once for each worksheet:
At the end of the package, this is what the data flow should show (since there are 6 rows in the March worksheet, the last one to be imported).
And that's it! Told you it was a messy procedure ...
Some other pages relevant to the above blog include:
From: | jadibhai |
When: | 04 Mar 20 at 18:38 |
After following your steps, I got stuck at the point where i have to choose a provider.
From the drop down list, I could not find "Microsoft Office 12.0 Access Database Engine Oledb provider".
I'm using VS 2015 enterprise edition (32 bit) and Office 2016 (64 bit).
I downloaded and installed "Microsoft Access Database Engine 2016 Redistributable (64bit)" but no luck.
When i tried installing "Microsoft Access Database Engine 2016 Redistributable (32 bit)" instead, i got an error stating i first have to uninstall the 64 bit version of Office 2016.
Could the reason i'm not seeing the provider listed in VS 2015 be that I need Office 2016 32 bit instead of 64? Or do you know of another reason.
From: | Andy B |
When: | 06 Mar 20 at 09:08 |
I have to be honest, I don't know! I had the 32-bit version of Office installed on my computer when I wrote this blog. I've just tried this again, and I have the Microsoft Access 12.0 and Access 16.0 drivers listed. Maybe you have to have Access installed? Let me know how you get on.
Kingsmoor House
Railway Street
GLOSSOP
SK13 2AA
Landmark Offices
99 Bishopsgate
LONDON
EC2M 3XD
Holiday Inn
25 Aytoun Street
MANCHESTER
M1 3AE
© Wise Owl Business Solutions Ltd 2024. All Rights Reserved.