BLOGS BY TOPIC
BLOGS BY AUTHOR
BLOGS BY YEAR
Posted by Andy Brown on 30 January 2015
You need a minimum screen resolution of about 700 pixels width to see our blogs. This is because they contain diagrams and tables which would not be viewable easily on a mobile phone or small laptop. Please use a larger tablet, notebook or desktop computer, or change your screen resolution settings.
How to loop over most recently created files in SSIS files
I've previously blogged on how to loop over files in Integration Services, writing the name and path of each one found to a SQL Server table. However, how do you just include files modified since a given date, or pick out the most recently created one? This blog shows you the technique you'll need!
Creating the table of files
To save you time, gentle reader, here's the table I'm using for this example:
-- create a table to hold the files
CREATE TABLE tblFile(
-- number each file added (just a habit!)
FileId int IDENTITY(1,1) NOT NULL PRIMARY KEY,
-- the name of each file (including path)
FileName nvarchar(max) NULL,
-- when the file was last updated
LastUpdated date NULL
Just run this script in SQL Server Management Studio to create a table with 3 columns.
Creating the package to capture each file's name
This is aimed at people who have been on our SSIS course, although you can probably get all of the information you need from my previous blog on looping over files alluded to above. Assume that we now have the following package:
The package deletes all of the rows in the files table, then adds then back in.
The main settings for the loop are as follows:
This loop will pick out all files with an SQL extension in a given folder (and all of its subfolders); yours could do something different.
Here are the variables I've used:
The variables capture the name of each file and - later, at any rate - the date it was last updated.
The loop over files captures each file name in the FileName variable:
The first (and only) bit of information returned for each file is its name and path, which we capture in the FileName variable.
Finally, here's the SQL statement being run each time round the loop:
This statement will insert a row into the table of files, substituting in the file name and when the file was last updated.
Here's the parameter mapping for this SQL statement:
The first question mark will become the file name, and the second the date it was last updated.
Capturing when the file was last updated
So far all we've done is get the name of the file; how do you get other attributes, such as its size, or when it was last updated? The only way seems to be to add a script task to get a reference to the file in code. First add the script task:
The script task to get at the file's details.
Now configure it (I'm using C# here, but the VB equivalent would be identical, apart from the choice of language):
We'll pass in the name of each file, and set when it was last updated.
Here's the C# script to get at the file's details:
public void Main()
// get the name of this file, including path
string filePath = Dts.Variables["FileName"].Value.ToString();
// get a reference to this file
System.IO.FileInfo eachFile = new System.IO.FileInfo(filePath);
// set when it was last updated (could also get other attributes)
Dts.Variables["LastUpdated"].Value = eachFile.LastWriteTime;
// finished OK
Dts.TaskResult = (int)ScriptResults.Success;
Because I'm feeling generous today, here's the VB equivalent code:
Public Sub Main()
'get the name of this file, including path
Dim filePath As String = Dts.Variables("FileName").Value.ToString
'get a reference to this file
Dim eachFile As New System.IO.FileInfo(filePath)
'set when it was last updated (could also get other attributes)
Dts.Variables("LastUpdated").Value = eachFile.LastWriteTime
Dts.TaskResult = Convert.ToInt32(ScriptResults.Success)
Like magic, when you run this you'll get the date/time each file was last updated:
The results of running this improved package - we have the file update date/time.