May
1
2012

SQL Server 2012 Property Search Assistant published

This post describes a utility program that makes it easier to set up the Property Search feature in SQL Server 2012.

I have just published the source code for the SQL Server 2012 Property Search Assistant on Codeplex.

The SQL Server 2012 Property Search Assistant is a small utility program that helps you to configure the Property Search feature which is new in SQL Server 2012 Full-Text Search. The Property Search Assistant is written in C#.

  1. About Property Search in SQL Server 2012
  2. Why do I need the Property Search Assistant?
  3. Prerequisites
  4. Running the Property Search Assistant
  5. Troubleshooting
  6. Disclaimers
  7. Suggestions for enhancement

About Property Search in SQL Server 2012

Full-Text Search in SQL Server 2012 now supports querying the properties of documents stored in the database, such as the "Author" property of a Microsoft Word document.

For example, the following query searches on a registered search property, Title, in the Document column of the Production.Document table of the AdventureWorks database. The query returns only documents whose Title property contains the string "Maintenance" or the string "Repair."

SELECT Document FROM Production.Document
  WHERE CONTAINS ( PROPERTY ( Document, 'Title' ), 'Maintenance OR Repair')
GO

 

Why do I need the Property Search Assistant?

However, you have to do several things before you can search the properties of documents that you've stored in your SQL Server 2012 databases:

  1. Enable full-text indexing on the tables and columns that you want to search.
  2. Create a new empty search property list with the CREATE SEARCH PROPERTY LIST statement.
  3. Associate the search property list with an existing full-text index with the SET SEARCH PROPERTY LIST clause of the ALTER FULLTEXT INDEX statement.
  4. Register each property that you want to be searchable with an individual ALTER SEARCH PROPERTY LIST statement.

The last step is the most challenging, because you have to provide a GUID value and an integer identifier for each property … and these are not so easy to find. (The GUID identifies a Property Set, and the integer identifies the individual property within in the set.) To find these values, you have to look them up in a list published online, or parse the output of a command-line tool that extracts properties. For more information about this tedious process, see Find Property Set GUIDs and Property Integer IDs for Search Properties.

Here's an example of the ALTER statement that you have to write to register each property that you want to be searchable:

ALTER SEARCH PROPERTY LIST [MyPropertyList]
    ADD 'System.Author'
        WITH
        (
            PROPERTY_SET_GUID = '{F29F85E0-4FF9-1068-AB91-08002B27B3D9}'
            PROPERTY_INT_ID = 4
        )

 

The Property Search Assistant makes this easy for you by letting you select the properties that you want to index in a friendly user interface, then generating the ALTER SEARCH PROPERTY LIST statements for you to run on your server.

Here is the user interface of the Property Search Assistant:

Here is a sample of the ready-to-run Transact-SQL output of the Property Search Assistant:

Prerequisites

Before you can run the Property Search Assistant, you have to prepare a few things.

  1. Download the code from Codeplex.
  2. Optionally – if you want to index Office 2010 documents, and you do not yet have the Office 2010 filters installed – download the Microsoft Office 2010 Filter Packs and then the Service Pack 1 for Microsoft Office Filter Pack 2010.
  3. Download at least a portion of the Windows SDK to obtain the command-line tool that is used in the background by the Property Search Assistant.

The Property Search Assistant extracts the properties from a sample document by running the command-line SDK tool, filtdump.exe, and parsing its output. To obtain filtdump.exe, install at least the Tools option in the Windows Native Code Development group of the Windows SDK Installation Options dialog box, as shown here:

For each property that it finds in the selected document, filtdump.exe outputs information similar to what is shown here:

The Property Search Assistant parses the output for lines that begin with "Attribute =" and extracts the property set GUID, the integer ID, and the property name that appear on the same line of text.

Running the Property Search Assistant

1. Run the code in Visual Studio, or build the .exe and run it from the command line or by double-clicking.

2. Find filtdump.exe on your hard drive if the program has not successfully found it for you.

3. Pick a sample document of a type whose properties you want to index. For example, if you store Microsoft Word documents in your database, choose any Word document that is not empty.

4. Optionally, if you want to add a USE <database> statement to the generated script, check the box and specify the name of the database. This adds Transact-SQL code similar to the following:

USE [MyDatabase]
GO

 

5. Specify the name of your new or existing search property list. Optionally, if you want to add the CREATE statement for the search property list, check the box. This adds Transact-SQL code similar to the following:

CREATE SEARCH PROPERTY LIST [MyPropertyList]
GO

 

6. Select or type the name of the folder where you want the utility to save the script file that it generates.

7. Optionally, change the name of the generated script file.

8. Click "Extract Properties."

9. In the list of properties, select the properties that you want to make searchable in your database by checking the check boxes next to their names.

10. Click "Generate Script."

11. The script file is saved to disk and displayed in a Notepad window. To run the script, you can copy and paste it into a new query window in SQL Server Management Studio. Adjust the names for the current database and for the search property list as appropriate.

Troubleshooting

I'm getting an error installing the Windows SDK.

You might have a newer version of the Visual C++ redistributables if you find an error 5100 near the end of the SDK installer log file. Uninstall them, re-run the SDK installation, then use Windows Update to retrieve the more recent versions of the Visual C++ redistributables. (On my own computer, I had to uninstall both the x86 and the x64 versions. I was grateful to find the solution to this problem in this forum post.)

Filtdump is raising an error, or no properties are found.

Is the document empty? If you selected an Office 2010 document, have you installed the Office 2010 filters? If you selected something like a PDF file, have you installed the filter for that file type?

Disclaimers

  1. I am a Microsoft employee.
  2. I am not a professional developer.
  3. This utility program is not an official Microsoft release, and is not supported by Microsoft.
  4. This code does not contain production-level error and exception handling. However, since it does not connect to your database, and since it only writes a small text file to the file system, there is little risk to your computer.

Suggestions for enhancement

If the utility connected to the instance of SQL Server, it would be possible to add the following features:

  • Constrain the types of documents that can be selected to those that are supported by installed filters. You can get the list of supported document types by querying the catalog view sys.fulltext_document_types.
  • Constrain the list of databases to existing user databases.
  • Constrain the list of existing search property lists by querying the catalog view sys.registered_search_property_lists.
  • Constrain the list of available properties to those that have not yet been added to the existing search property list. You can get the list of registered properties by querying the catalog view sys.registered_search_properties.
  • Provide the option to run the generated script on the server.

 


 

Thank you for your interest.