Wednesday, November 9, 2011

Recently came across a problem with nicely formatted excel file(with all pictures, bolded colorful text headings, etc), I wanted to import the important data into our database using SSIS, however with all these junk info at the start. SSIS didn't behave as I wanted, instead of importing all the prices figures it imported all null values instead. 

I did some search on the net, found that SSIS was defaulted using first 8 rows in an Excel file to determine the datatype, which in my case, all first 8 lines were blank. So problem found, but to fix it, how?

There are couple solutions to this issue:
1. Simple and maybe the most easier way to do it. Change the 'Openrowset' Property under Excel Data Source's Properties, from 'Sheet1$', to Sheet1$A1:G1000. basically manually set the range you want, which handles this problem perfectly.

2. Use Script component to code for filtering, Read the file in as a whole inside script component, then looping through each individual rows to filter which to keep. 

3. Manually change the property field type of source Excel file column, in my case, set "price " column from 'number' to 'text'

4. MSDN suggests there's an registry key you may change to determine how many rows to check by SSIS before determination of the column type. Named: TypeGuessRows under HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Jet\4.0\Engines\Excel.