home
SSIS - Import all files to staging tables and log process
Author Nigel Rivett

Description

create a package to import files from a folder to a staging table
Files are imported using a file type defined by a file mask
The staging table includes the file name and datetime of the import
The end of the file import is logged with the filename and rowcount

This processing description shows how to create the import together with all tables needed.
The filemask for this import is AccountTransaction*.csv and table AccountTransaction_stg
The process will import all files with that filemask into that table.
To import other filetypes add and configure further loop containers

Note: this process concentrates on csv files with the column names in the first row
Other file formats can be added by configuring the fataflow source component


Process

	log start of package
	truncate staging table for filetype1
	For each file with name format for filetype1
		import file to staging table
		log import with rowcount
	truncate staging table for filetype2
	For each file with name format for filetype2
		import file to staging table
		log import with rowcount
	log end of package
	Execute stored procedures to process all files

Detect all the files of the correct type in a folder

Create a string variable a_ImportFolder
	Set the value to the import folder location
Create a filemask string variable for each filetype
	a_FileMask_AccountTransaction 	AccountTransaction*.csv
Create a string variable for the filenames to import
	ww_FileName_AccountTransaction

Create a Foreach Loop container
	Name the container ForeachFile AccountTransaction
	Collection
		Enumerator	Foreach File Enumerator
		IsDestinationPathVariable	true
		Expressions
			Directory	@[User::a_ImportFolder]
			FileSpec	@[User::a_FIleMask_AccountTransaction]
		Retrieve file name
			Name and Extension	(we already have the folder in a variable)
	Variable Mappings
		User:ww.FileName_AccountTransaction		0

	This container will now execute for each of the files in the folder with the correct file mask

Truncate the staging table

Create an execute sql task
	Name		Truncate AccountTransaction_stg
	Connection	Staging database connection
	SQLSourceType	Direct input
	SQLStatement	truncate table AccountTransaction_stg  	set to staging table name
Set a precendence constraint to execute this before the for each loop container

Import the files

create a Data Flow Task named Import AccountTransaction_stg
Place this task inside the ForeachFile AccountTransaction container

Configure the 	Data Flow Task
	Add a flat File Source
	Flat file connection manager	New
	Configure Flat file connection manager
		General
			Name		AccountTransaction
			FileName	Browse to test file
			Text qualifier 	"
			select Column names in in first data row
		Columns
			Check this is correct
		Advanced
			Set each of the OutputColumnWidth values to 255
				Can be configured correctly if you know the values but this is a good start
				The default is 50 which is often too small
				Note - for truncation errors change the size in the connection manager
		Preview	
			Check this looks correct

	Add a OLE DB Destination
	Drag the data flow path from the flat file source to the OLE DB Destination
	Configure OLE DB Destination
		Connection Manager
			OLE DB Connection manager	Staging database connection
			Data access mode		Table or view - fast load
			Name of table or view		New
				This will show the script to create the table
				change the table name in the create statement to AccountTransaction_stg
				Save the script in an sql file to change later
				click OK to execute the script and create the table
		Mappings
			This should map each column fgrom the file to the table
		Click OK to save the import

This is now all you need to import the files of this type
It needs more configuration and changes but now is a good time to test

The dataflow task is set to import your test file
Right click and execute the task.
There may be trucation errors
	Look at the progress tab
	Find the column in error and the row
	Check the source data to see if this is in fact correct
	Update the File manager (advanced) to set the size for this column
	Update the staging table script for this column size and run it
	Double click on the flat file source and save
	Double click on the OLE DB Destination and save
	Rerun
 	May need to do this several times

Now run the truncate task followed by the for each loop container and it should import the test file - try renaming the file

Configure the AccountTransaction connection manager to use the filename variable
	Expressions
		ConnectionString	@[User::a_ImportFolder] + "\\" + @[User::ww_FileNameAccountTransaction]	(note the escaped \)
		note: for an excel connection set the ExcelFilePath

Test the import

Execute the task Truncate AccountTransaction_stg
Execute the container ForeachFile AccountTransaction

Add the source filename to the staging table

Change the AccountTransaction_stg table script to add two columns at the end 
	z_FileName varchar(200) not null
	z_date datetime not null default getdate()
Recreate the staging table using this script
z_date is to keep a record of when the rows were added to the table

Open the data flow task Import AccountTransaction_stg
Add a derived column dataflow component
Place this between the Row Count and Destination components
Data Flow Path
	Row Count -> Derived Column
	Derived Column -> OLE DB Destination

Edit the Derived Column component
	Derived Column Name 	z_FileName
	Derived Column		Add as new column
	Expression		(DT_STR,200,1252) @[User::ww_FileName_AccountTransaction]

Open the OLE DB Destination component
	Mappings - map z_FileName to z_FileName
Execute the container ForeachFile AccountTransaction
Check the results
	select top 10 * from AccountTransaction_stg
	check z_FileName and z_date

Now we can add logging to the package

Logging

Create a logging table Trace

if exists (select * from sys.tables where name = 'Trace')
	drop table Trace
go
CREATE TABLE [dbo].[Trace]
    (
    [Trace_id] [int] IDENTITY(1,1) NOT NULL,
    [EventDate] [datetime] NOT NULL default getdate(),
    [Entity] [varchar](100) NOT NULL,
    [key1] [varchar](100) NULL,
    [key2] [varchar](100) NULL,
    [key3] [varchar](100) NULL,
    [data1] [varchar](max) NULL,
    [data2] [varchar](max) NULL,
    [data3] [varchar](max) NULL,
    [UserName] [varchar](128) NULL default suser_sname()
    )
go

Create a string variable for each feature you wish to log
I tend to use a different variable for each file type
e.g.
	w_log_Package_start
	w_log_AccountTransaction_load
	w_log_Package_end

Create Int32 variables for the rowcounts
I name these w1_ to show that they are used in another variable
	w1_AccountTransaction_Rowcount

In the dataflow task (Import AccountTransaction_stg)
	Add a Row Count task
	Change the dataflow path to include the Rowcount
		delete
		Source -> Rowcount
		Rowcount -> Destination
	Set the Row Count task variable to w1_AccountTransaction_Rowcount

Now, when a file is loaded the rowcount will be in this variable ready for logging

In the database create a stored procedure to support the logging
create proc s_SSIS_LogImport
@Package	varchar(128) ,
@FileName	varchar(128) ,
@Rowcount	varchar(20)
as
	insert Trace (Entity, key1, key2, data1)
	select @Package, 'Import', @FileName, @Rowcount
go

set the expression for the variable w_log_AccountTransaction_load
	Expression "s_SSIS_LogImport" 
			+ "'" +  @[System::PackageName] 
			+ "','" +  @[User::ww_FileName_AccountTransaction] 
			+ "','" + (DT_STR,20,1252) @[User::w1_AccountTransaction_Rowcount] 
			+ "'"

Create an execute sql task
	Name		Log Import AccountTransaction
	Connection	Staging database connection
	SQLSourceType	Variable
	SourceVariable	User::w_log_AccountTransaction_load
Move this task to inside the ForeachFile AccountTransaction container
Set a precendence constraint to execute this after the file import

Test the logging

Execute the task Truncate AccountTransaction_stg
Execute the container ForeachFile AccountTransaction

In the database execute the query
	select top 10 * from Trace order by Trace_id desc

Using this method we need for each file type
Variables
	a_FileMask_AccountTransaction
	w_log_AccountTransaction_load
	w1_AccountTransaction_Rowcount
	ww_FileName_AccountTransaction
Tasks
	Truncate AccountTransaction_stg
	ForeachFile AccountTransaction
		Import AccountTransaction_stg
		Log Import AccountTransaction
File Connection managers
	AccountTransaction



home