home
SSIS - Import all files to staging tables and log process
Author Nigel Rivett
Description
create a package to import files from a folder to a staging table
Files are imported using a file type defined by a file mask
The staging table includes the file name and datetime of the import
The end of the file import is logged with the filename and rowcount
This processing description shows how to create the import together with all tables needed.
The filemask for this import is AccountTransaction*.csv and table AccountTransaction_stg
The process will import all files with that filemask into that table.
To import other filetypes add and configure further loop containers
Note: this process concentrates on csv files with the column names in the first row
Other file formats can be added by configuring the fataflow source component
Process
log start of package
truncate staging table for filetype1
For each file with name format for filetype1
import file to staging table
log import with rowcount
truncate staging table for filetype2
For each file with name format for filetype2
import file to staging table
log import with rowcount
log end of package
Execute stored procedures to process all files
Detect all the files of the correct type in a folder
Create a string variable a_ImportFolder
Set the value to the import folder location
Create a filemask string variable for each filetype
a_FileMask_AccountTransaction AccountTransaction*.csv
Create a string variable for the filenames to import
ww_FileName_AccountTransaction
Create a Foreach Loop container
Name the container ForeachFile AccountTransaction
Collection
Enumerator Foreach File Enumerator
IsDestinationPathVariable true
Expressions
Directory @[User::a_ImportFolder]
FileSpec @[User::a_FIleMask_AccountTransaction]
Retrieve file name
Name and Extension (we already have the folder in a variable)
Variable Mappings
User:ww.FileName_AccountTransaction 0
This container will now execute for each of the files in the folder with the correct file mask
Truncate the staging table
Create an execute sql task
Name Truncate AccountTransaction_stg
Connection Staging database connection
SQLSourceType Direct input
SQLStatement truncate table AccountTransaction_stg set to staging table name
Set a precendence constraint to execute this before the for each loop container
Import the files
create a Data Flow Task named Import AccountTransaction_stg
Place this task inside the ForeachFile AccountTransaction container
Configure the Data Flow Task
Add a flat File Source
Flat file connection manager New
Configure Flat file connection manager
General
Name AccountTransaction
FileName Browse to test file
Text qualifier "
select Column names in in first data row
Columns
Check this is correct
Advanced
Set each of the OutputColumnWidth values to 255
Can be configured correctly if you know the values but this is a good start
The default is 50 which is often too small
Note - for truncation errors change the size in the connection manager
Preview
Check this looks correct
Add a OLE DB Destination
Drag the data flow path from the flat file source to the OLE DB Destination
Configure OLE DB Destination
Connection Manager
OLE DB Connection manager Staging database connection
Data access mode Table or view - fast load
Name of table or view New
This will show the script to create the table
change the table name in the create statement to AccountTransaction_stg
Save the script in an sql file to change later
click OK to execute the script and create the table
Mappings
This should map each column fgrom the file to the table
Click OK to save the import
This is now all you need to import the files of this type
It needs more configuration and changes but now is a good time to test
The dataflow task is set to import your test file
Right click and execute the task.
There may be trucation errors
Look at the progress tab
Find the column in error and the row
Check the source data to see if this is in fact correct
Update the File manager (advanced) to set the size for this column
Update the staging table script for this column size and run it
Double click on the flat file source and save
Double click on the OLE DB Destination and save
Rerun
May need to do this several times
Now run the truncate task followed by the for each loop container and it should import the test file - try renaming the file
Configure the AccountTransaction connection manager to use the filename variable
Expressions
ConnectionString @[User::a_ImportFolder] + "\\" + @[User::ww_FileNameAccountTransaction] (note the escaped \)
note: for an excel connection set the ExcelFilePath
Test the import
Execute the task Truncate AccountTransaction_stg
Execute the container ForeachFile AccountTransaction
Add the source filename to the staging table
Change the AccountTransaction_stg table script to add two columns at the end
z_FileName varchar(200) not null
z_date datetime not null default getdate()
Recreate the staging table using this script
z_date is to keep a record of when the rows were added to the table
Open the data flow task Import AccountTransaction_stg
Add a derived column dataflow component
Place this between the Row Count and Destination components
Data Flow Path
Row Count -> Derived Column
Derived Column -> OLE DB Destination
Edit the Derived Column component
Derived Column Name z_FileName
Derived Column Add as new column
Expression (DT_STR,200,1252) @[User::ww_FileName_AccountTransaction]
Open the OLE DB Destination component
Mappings - map z_FileName to z_FileName
Execute the container ForeachFile AccountTransaction
Check the results
select top 10 * from AccountTransaction_stg
check z_FileName and z_date
Now we can add logging to the package
Logging
Create a logging table Trace
if exists (select * from sys.tables where name = 'Trace')
drop table Trace
go
CREATE TABLE [dbo].[Trace]
(
[Trace_id] [int] IDENTITY(1,1) NOT NULL,
[EventDate] [datetime] NOT NULL default getdate(),
[Entity] [varchar](100) NOT NULL,
[key1] [varchar](100) NULL,
[key2] [varchar](100) NULL,
[key3] [varchar](100) NULL,
[data1] [varchar](max) NULL,
[data2] [varchar](max) NULL,
[data3] [varchar](max) NULL,
[UserName] [varchar](128) NULL default suser_sname()
)
go
Create a string variable for each feature you wish to log
I tend to use a different variable for each file type
e.g.
w_log_Package_start
w_log_AccountTransaction_load
w_log_Package_end
Create Int32 variables for the rowcounts
I name these w1_ to show that they are used in another variable
w1_AccountTransaction_Rowcount
In the dataflow task (Import AccountTransaction_stg)
Add a Row Count task
Change the dataflow path to include the Rowcount
delete
Source -> Rowcount
Rowcount -> Destination
Set the Row Count task variable to w1_AccountTransaction_Rowcount
Now, when a file is loaded the rowcount will be in this variable ready for logging
In the database create a stored procedure to support the logging
create proc s_SSIS_LogImport
@Package varchar(128) ,
@FileName varchar(128) ,
@Rowcount varchar(20)
as
insert Trace (Entity, key1, key2, data1)
select @Package, 'Import', @FileName, @Rowcount
go
set the expression for the variable w_log_AccountTransaction_load
Expression "s_SSIS_LogImport"
+ "'" + @[System::PackageName]
+ "','" + @[User::ww_FileName_AccountTransaction]
+ "','" + (DT_STR,20,1252) @[User::w1_AccountTransaction_Rowcount]
+ "'"
Create an execute sql task
Name Log Import AccountTransaction
Connection Staging database connection
SQLSourceType Variable
SourceVariable User::w_log_AccountTransaction_load
Move this task to inside the ForeachFile AccountTransaction container
Set a precendence constraint to execute this after the file import
Test the logging
Execute the task Truncate AccountTransaction_stg
Execute the container ForeachFile AccountTransaction
In the database execute the query
select top 10 * from Trace order by Trace_id desc
Using this method we need for each file type
Variables
a_FileMask_AccountTransaction
w_log_AccountTransaction_load
w1_AccountTransaction_Rowcount
ww_FileName_AccountTransaction
Tasks
Truncate AccountTransaction_stg
ForeachFile AccountTransaction
Import AccountTransaction_stg
Log Import AccountTransaction
File Connection managers
AccountTransaction
home