Recently an internal Microsoft team ran into performance issues when they switched from using the Jet Provider and writing out Excel 2003 files, to using the Office 12/2007 ACE Provider and writing out Excel 2007 files from their Integration Services package. Their inquiries brought the following information to light:
There are 3 reasons for reduced performance with the Office 12 ACE Provider when used with Excel 2007 files, either binary (.xlsb)or XML (*.xlsx):
- The ACE Provider reads the entire Excel 2007 file when it first opens it.
- It does not use indexing, as earlier versions of the provider did, and as ACE itself does with earlier Excel file formats.
- Because of the lack of indexing, it retains the entire dataset in memory.
If this reduced performance with ACE and Excel 2007 files becomes a showstopper, then you can consider the following workarounds:
- Use the Excel 2003 (.xls) file format, with either the ACE Provider or the Jet Provider, if your data does not exceed the ~65K rows maximum for the Excel 2003 format.
- Prefer the Excel 2007 binary (.xlsb) file format over the default Excel 2007 XML (.xlsx) format for faster parsing.
- Prefer several smaller Excel 2007 files over 1 larger file for faster parsing.
Thanks to Integration Services developer and blogger Matt Masson for passing along this information, and to Office developer Ofir Reuveny for providing the causes and workarounds.