SQL Server Forum / General / Data Warehousing / April 2005
Importing Data from an EXCEL File
|
|
Thread rating:  |
John Smith - 24 Apr 2005 02:56 GMT hi all,
first of all, i apologize for posting this in multiple forums; i don't know which is the most appropriate to direct this question to.
on to business... i am trying to import data from an excel file using the OPENROWSET function in sql server. unfortunately, the results of some fields were truncated to 255 character-length.
here's what i've done so far: 1. tried using other function such as OPENDATASOURCE instead, but that function not only gives the same problem, it has its own set of issues too (like when the field was cast to a sql server's datatype). so i threw out this option. 2. tried using dts (same problem, plus i'd like to automate it as a t-sql procedure instead). 3. tweaking the registry setting for HKEY..MACHINE/Software/..Jet/Excel etc. as suggested from a Microsoft knowledge-base article.
no-dice. do you have an answer to this problem? or, perhaps a different approach to importing data from excel?
thank you very much, ~ js
Hari Prasad - 24 Apr 2005 13:55 GMT Hi,
Steps:-
1. store Excel sheet in 97-2000 format
2. Select in DTS import Wizard Excel 97/200 as data source
3. Check "transform" - "create a destination table" - "Edit SQL".
4. Change CREATE TABLE [dbname].[dbo].[sheet1] ([col1] nvarchar (255))
5. Change it to nvarchar (1000) or so and click ok and verify.
6. If you have pure Excel 2000 format you shouldnt have any issues
If you still have issues then ; have a look into the below KB article:-
http://support.microsoft.com/default.aspx?scid=kb;EN-US;Q189897
Thanks Hari SQL Server MVP
> hi all, > [quoted text clipped - 21 lines] > thank you very much, > ~ js John Smith - 24 Apr 2005 23:27 GMT hi hari,
thank you for your input. unfortunately, i had already tried that prior to posting my question and still was not successful (i have office xp, which i thought should've taken care of that). but i will definitely try again again to see if i missed something during my checks.
secondly, i had also followed the method (registry tweak) outlined in that kb-article from the link you included in your reply, that is kb-article what i was referring to in my earlier post (i apologize i wasn't being clear on my question).
and last but not least, while i will certain try the dts method again to see if i missed anything, i am looking for a more programmatic solution using the OPENROWSET or OPENDATASOURCE functions than the DTS. although, understandably if the DTS portion works, the function-use should also work.
the following is what i used in my import code, perhaps you can point out the error if you see any?
OPENROWSET attempt: select * from OpenRowSet ( 'Microsoft.Jet.OLEDB.4.0' , 'Excel 8.0; DATABASE=c:\DataSourceFile.xls' , 'select * from [Sheet1$]' ) /* result: still truncated; perhaps instead of 'Excel 8.0' I should use a different string for Excel 97-2000 version? */
OPENDATASOURCE attempt: select cast(RowId as int) RowId , cast([ReallyLongString] as nvarchar(3900)) ReallyLongString from OpenDataSource( 'Microsoft.Jet.OLEDB.4.0', 'Data Source="c:\DataSourceFile.xls";Extended properties=Excel 8.0' )...[Sheet1$] /* result: still truncated; same deal above, plus OPENDATASOURCE other issues. */
thank you again, ~js
> Hi, > [quoted text clipped - 45 lines] >>thank you very much, >>~ js John Smith - 24 Apr 2005 23:27 GMT hi hari,
thank you for your input. unfortunately, i had already tried that prior to posting my question and still was not successful (i have office xp, which i thought should've taken care of that). but i will definitely try again again to see if i missed something during my checks.
secondly, i had also followed the method (registry tweak) outlined in that kb-article from the link you included in your reply, that is kb-article what i was referring to in my earlier post (i apologize i wasn't being clear on my question).
and last but not least, while i will certain try the dts method again to see if i missed anything, i am looking for a more programmatic solution using the OPENROWSET or OPENDATASOURCE functions than the DTS. although, understandably if the DTS portion works, the function-use should also work.
the following is what i used in my import code, perhaps you can point out the error if you see any?
OPENROWSET attempt: select * from OpenRowSet ( 'Microsoft.Jet.OLEDB.4.0' , 'Excel 8.0; DATABASE=c:\DataSourceFile.xls' , 'select * from [Sheet1$]' ) /* result: still truncated; perhaps instead of 'Excel 8.0' I should use a different string for Excel 97-2000 version? */
OPENDATASOURCE attempt: select cast(RowId as int) RowId , cast([ReallyLongString] as nvarchar(3900)) ReallyLongString from OpenDataSource( 'Microsoft.Jet.OLEDB.4.0', 'Data Source="c:\DataSourceFile.xls";Extended properties=Excel 8.0' )...[Sheet1$] /* result: still truncated; same deal above, plus OPENDATASOURCE other issues. */
thank you again, ~js
> Hi, > [quoted text clipped - 45 lines] >>thank you very much, >>~ js Steve Kass - 25 Apr 2005 00:33 GMT John,
Are you certain the data is truncated? If you are looking at it in Query Analyzer, perhaps you have never increased the value of "Maximum Characters per Column" in Tools|Options? The default value for that setting, which governs Query Analyzer output only, is 256.
The code you posted here does not import anything, nor does it demonstrate that the values are truncated. Just to be sure, can you post the results of this?
select * into #tempfile from OpenRowSet...
select top 10 datalength(ReallyLongString), ReallyLongString from #tempfile order by datalength(ReallyLongString) desc
Steve Kass Drew University
> hi hari, > [quoted text clipped - 95 lines] >>> thank you very much, >>> ~ js John Smith - 25 Apr 2005 01:00 GMT hi steve,
yes i am pretty sure. i guess i also have a lot to learn in the art of asking question the proper way :)
i did have the following in my procedure (not unlike the #tempfile sample you suggested earlier): if object_id('[raw.importedData]') is not null drop table [raw.importedData]
as well as the appropriate
select * into [raw.importedData] ... for both excerpted code earlier. i only had excerpted my code minimally to illustrate that i've made proper attempts before i ask. :) don't want people to think i didn't try my best before asking the question.
i also had used a similar method (as you suggested) to check the data, for example: select max(len(ltrim(rtrim([ReallyLongString])))) stringLength from [raw.importedData]
also, select [ReallyLongString] from [raw.importedData] where len(ltrim(rtrim([ReallyLongString]))) = ( select max(len(ltrim(rtrim([ReallyLongString])))) stringLength from [raw.importedData] )
then eye-balled the resulting string from the original string in the EXCEL file, and sure enough it's truncated.
my confusion is, i really thought the excel 97 - 2000 thing (that i tried before hari suggested it, and that he essentially "confirmed" for me) would have taken care of the problem. and just to make sure i open and re-saved the EXCEL file to ensure that it's saved as the proper version of excel that i have installed locally in the computer. still no dice... i'm stumped.
i guess my next attempt is to reinstall office?
~ js.
> John, > [quoted text clipped - 118 lines] >>>> thank you very much, >>>> ~ js Steve Kass - 25 Apr 2005 01:30 GMT See inline. -SK
> i also had used a similar method (as you suggested) to check the data, > for example: > select max(len(ltrim(rtrim([ReallyLongString])))) stringLength > from [raw.importedData] And what is the result when you do this?
> also, > select [ReallyLongString] [quoted text clipped - 6 lines] > then eye-balled the resulting string from the original string in the > EXCEL file, and sure enough it's truncated. If you eyeballed it in Query Analyzer, my question is still unanswered. Did you try changing the Query Analyzer setting I mentioned?
> i guess my next attempt is to reinstall office? Not until you post some actual results that prove the data is being truncated, I wouldn't.
I would be surprised if reinstalling office turns out to be the solution.
> ~ js. > [quoted text clipped - 121 lines] >>>>> thank you very much, >>>>> ~ js John Smith - 25 Apr 2005 04:14 GMT The result of that max(len(ltrim(rtrim(..)))) operation was 255. Sorry, I didn't answer your question. My Query Analyzer was set at 8000 characters per column, from the following settings Options => Results => Maximum characters per column: 8000.
Is there any other Excel string identifier for Excel version 97-2000 other than the string "Excel 8.0"?
> See inline. -SK > [quoted text clipped - 152 lines] >>>>>> thank you very much, >>>>>> ~ js Steve Kass - 25 Apr 2005 06:01 GMT Is the first 255+ character row not near the top of the Excel file?
Check the registry values
HKEY_LOCAL_MACHINE/SOFTWARE/Microsoft/Jet/4.0/Engines/Excel/ImportMixedTypes HKEY_LOCAL_MACHINE/SOFTWARE/Microsoft/Jet/4.0/Engines/Excel/TypeGuessRows
The first should be set to 'text' and the second should be set to a number at least as great as the row number of the first more-than-256-character row.
Alternatively, you can put a dummy long value at the top of the Excel file.
Best I can tell, if the first TypeGuessRows-many values in a column are 255 characters or less, the column is imported as nvarchar(255). If there is a longer value in the column, it is imported as ntext.
If it doesn't work the first time, try also adding IMEX=1 to your connection string.
SK
> The result of that max(len(ltrim(rtrim(..)))) operation was 255. > Sorry, I didn't answer your question. [quoted text clipped - 165 lines] >>>>>>> thank you very much, >>>>>>> ~ js John Smith - 26 Apr 2005 02:00 GMT hi steve, thanks for your input. i had made the registry tweaks as you indicated (according to microsoft's kb-article), but it didn't make any difference. however, your tip on adding IMEX=1 to the connection string along with installing office 2003 seem to do the trick. all is well.
thank you again for all your input.
sincerely, ~js
> Is the first 255+ character row not near the top of the Excel file? > [quoted text clipped - 190 lines] >>>>>>>> thank you very much, >>>>>>>> ~ js Steve Kass - 26 Apr 2005 02:33 GMT John,
Glad to hear you've worked this out. I'm curious - did you leave the registry tweak in after all? It appears the IMEX=1 setting makes the provider actually obey the registry setting
(see http://support.microsoft.com/?id=194124 for some info on IMEX - thanks to oj for first finding this link for me.)
SK
> hi steve, > thanks for your input. [quoted text clipped - 204 lines] >>>>>>>>> thank you very much, >>>>>>>>> ~ js John Smith - 26 Apr 2005 07:00 GMT yes steve, i left them in the registry ImportMixedTypes: Text TypeGuessRow: 0 (hex) 0 is to indicate that it needs to scan all fields, i think.
i notice your email is drew.edu; are you an instructor/professor?
~ js
> John, > [quoted text clipped - 215 lines] >>>>>>>>>> thank you very much, >>>>>>>>>> ~ js oj - 26 Apr 2005 07:12 GMT oh yeah...
http://www.drew.edu/directory/people/search.php?first_name=steve&last_name=kass& dept=MATH
 Signature -oj
> yes steve, i left them in the registry > ImportMixedTypes: Text [quoted text clipped - 4 lines] > > ~ js
|
|
|