This post is the 1st in a series of 3 planned installments about using Excel data files with Microsoft SQL Server Integration Services.
- Connections and components
- Tables and data types
- Issues and alternatives
The Excel Driver
Excel connectivity in SQL Server Integration Services uses the Jet provider and the Excel ISAM (indexed sequential access method) driver that is a component of the Microsoft Jet database engine.
Since the Jet provider expects an Access database file (*.mdb) by default, you have to give Jet additional information when you want it to use one of its ISAM drivers for a different file type like Excel. If you were writing your own connection string for the Excel driver, you would provide this additional information in the Extended Properties argument. For Excel 97-2003, the Excel connection manager handles this for you; for Excel 2007, you have to enter at least one value manually on the All tab of the Data Link Properties dialog box.
- The one required value is the Excel version: use "Excel 8.0" for Excel 97, 2000, 2002, and 2003 (Excel versions 8 through 11), and "Excel 12.0" for Excel 2007.
- The other commonly used value is "HDR=Yes" (or No), which corresponds to the First row has column names option in the Excel connection manager. If not specified, the default is "Yes". When you don't have headings, the driver assigns F1, F2, and so forth as temporary column headings. When you don't have headings but mistakenly tell the driver that you do, your first row of data disappears to serve as column headings.
You cannot use the driver to connect to a password-protected workbook. Don't waste time trying or arguing. However, the driver can connect to a password-protected workbook if the workbook is already open in the Excel application. But using the driver to read from a workbook that's open in Excel causes a memory leak that can cause Excel to stop responding.
The Excel driver is still 32-bit only
On a 64-bit computer, you have to run a package that uses Excel in 32-bit WOW mode. There is only a 32-bit version of the Excel driver. The program manager for ACE once told me that they hope to ship a 64-bit version of the ACE provider in the next version of Microsoft Office, Office 14. We'll see! (There's no Office 13.)
To get the 32-bit version of dtexec so you can run packages in 32-bit mode, make sure that you select Management Tools - Complete during SQL Server 2008 Setup, or Client Tools during SQL Server 2005 Setup, on a 64-bit computer.
For more information about 64-bit concerns, see 64-bit Considerations for SQL Server Integration Services.
Working with Excel 2003 and earlier
Working with Excel 2007 (12.0)
To work with Excel 2007 files, you have to use the "ACE" provider, which appears in the providers list as "Microsoft Office 12.0 Access Database Engine OLE DB Provider". The Office 12 team took the code for Jet and the Excel driver and updated it to work with the Excel 12/2007 file format (.xlsx). Of course it also works with previous versions of Excel file formats. If you don't have Office 2007, download the ACE provider at 2007 Office System Driver: Data Connectivity Components.
- In the SQL Server Import and Export Wizard, select "Microsoft Office 12.0 Access Database Engine OLE DB Provider" as the data source or destination. Type or paste the full path and filename as the value of Data Source on the Connection" tab of the Data Link Properties dialog box. Then be sure to add "Excel 12.0" as the value of Extended Properties on the All tab.
- In an Integration Services package, use an OLE DB Connection Manager, OLE DB Source and OLE DB Destination with Excel 2007 files. Choose and configure the provider as just described. (The plan to update the Excel Connection Manager to offer Excel 2007 as an option was postponed past SQL Server 2008 RTM.)
In an Integration Services package, the Excel Connection Manager now supports the Excel 2007 file format. Also, in the SQL Server Import and Export Wizard, the "Excel" source or destination option now supports the Excel 2007 format - you do not have to select the ACE Provider. In both cases, however, the Excel 97-2003 format remains the default setting for now.
This information is summarized in Books Online in How to: Connect to an Excel Workbook.
Here's an interesting gotcha! The Foreach File enumerator secretly appends a wildcard to the end of the file extension that you specify. So let's say that you configure the Foreach File enumerator to return a list of Excel 2003 .xls files. Surprise! You also get any Excel 2007 .xlsx files that exist in the same path, possibly causing your package to fail. There appears to be no way to specify, "Give me .xls but not .xlsx".
In our next installment...
In Tables and data types. we'll list the objects that represent tables in an Excel data source, and talk about how to query them. We'll also look at the data types recognized by the Excel driver, and the corresponding Integration Services data types, along with conversion issues.