Apr
28
2009

HTML Table Destination: An (incomplete) SSIS custom data flow component

I had fun recently creating a custom data flow destination for Integration Services - an HTML Table Destination that writes out the data from the data flow to an HTML file as an HTML table.

No applause, please - I copied SSIS developer Matt Masson's code from Codeplex to get started, including his handy helper functions for the tedious stuff like adding custom properties and caching input columns. Plus, it's not rocket science to output tabular data to a well-known text-based markup format.

You too could be coding custom components for SSIS! Browse the source code on Codeplex, and read the programming chapters in Kirk Haselden's book.

The HTML Table Destination uses a File connection manager to assign the output file. Its only custom property is the title to give the HTML table:

Here's a sample of what the output looks like, with data from AdventureWorks:

The main run-time logic is relatively simple:

  • PreExecute - Initialize a StreamWriter with the file path from the connection manager, cached from the AcquireConnection method. Write out the HTML tags that appear once at the top of the document, including the title as an H1 heading taken from the single custom property. Open the HTML table element. Write out the column headings.
  • ProcessInput - For each row of data, write out each value in a separate table cell. For null values, write out a non-breaking space ( ), or the empty cell has no border.
  • PostExecute - Close the HTML table element and write out the HTML tags that appear once at the bottom of the document. Close the StreamWriter.

So this is the only interesting code, shamelessly devoid of error-handling:

    Public Overrides Sub PreExecute()

        ' Get custom property values.
        _tableTitle = GetCustomPropertyValue("TableTitle").ToString()

        ' Initialize the StreamWriter.
        htmlFileWriter = New StreamWriter(_outputFile)

        ' Begin HTML document.
        With htmlFileWriter
            .Write("<html>")
            .Write("<head></head>")
            .Write("<body>")
            .Write("<h1>" & _tableTitle & "</h1>")
            .Write("<table border=""1"">")
            .Write("<thead>")
        End With

        ' Cache all of our input column information.
        Dim input As IDTSInput100 = ComponentMetaData.InputCollection(0)

        _inputColumnInfo = New List(Of InputColumnInfo)(input.InputColumnCollection.Count)

        For Each column As IDTSInputColumn100 In input.InputColumnCollection

            ' Cache column information.
            Dim info As New InputColumnInfo()
            info.Name = column.Name
            info.DataType = column.DataType
            info.ID = column.ID
            info.Index = BufferManager.FindColumnByLineageID(input.Buffer, column.LineageID)

            _inputColumnInfo.Add(info)

            ' Write column header into HTML table.
            htmlFileWriter.Write("<th>")
            htmlFileWriter.Write(column.Name)
            htmlFileWriter.Write("</th>")

        Next

        htmlFileWriter.Write("</thead>")

    End Sub

    Public Overrides Sub ProcessInput(ByVal inputID As Integer, ByVal buffer As Microsoft.SqlServer.Dts.Pipeline.PipelineBuffer)

        Do While buffer.NextRow()

            ' Begin HTML table row.
            htmlFileWriter.Write("<tr>")

            For Each col As InputColumnInfo In _inputColumnInfo

                ' Begin HTML table cell.
                htmlFileWriter.Write("<td>")

                Dim value As String = col.GetColumnValue(buffer).ToString()
                If Not String.IsNullOrEmpty(value) Then
                    htmlFileWriter.Write(HttpUtility.HtmlEncode(value))
                Else
                    htmlFileWriter.Write("&nbsp;")
                End If

                htmlFileWriter.Write("</td>")

            Next

            htmlFileWriter.Write("</tr>")

            ' Add to perf counter information.
            ComponentMetaData.IncrementPipelinePerfCounter(DTS_PIPELINE_CTR_ROWSWRITTEN, 1)

        Loop

    End Sub

    Public Overrides Sub PostExecute()

        With htmlFileWriter
            .Write("</table>")
            .Write("</body>")
            .Write("</html>")
        End With

        htmlFileWriter.Close()

    End Sub

The source code is attached to this blog posting. I'm sure you can think of many ways to enhance it...for example, by adding custom properties for the table width, or for various CSS styles.

Here are some more disclaimers about this custom component...

  • This component is not complete. In particular, I have not implemented adequate validation and error-handling. I am providing source code only to emphasize that this is a "proof of concept" and not production code.
  • I am using Visual Studio 2008 and SQL Server 2008 Integration Services. For the most part, 2008 custom components should be easy to adapt for 2005 by changing the assembly references, and any instance of "IDTSxxx100" to "IDTSxxx90".
  • I am a self-taught amateur programmer. It shows.
  • I typically still program in Visual Basic .NET. Neener neener.

I hope that someone finds this useful!

-Doug

HtmlTableDestination.zip (14.5 kb)

Comments (4) -

Vincent Wylenzek

Doug,


This could be useful I guess. Thanks for sharing.


grtz,


Vincent


Bill Ede

This has been extremely helpfull - the project I am working on is totally different but the principles and objects/methods for looping through the columns is what I needed to know and have had great difficulty in finding. I am having to upgrade a very robust tried and tested DTS ActiveX script to SSIS.


Victor Song

this is cool, I just need it.


CozyRoc

Check CozyRoc's Template Task: http://www.cozyroc.com/ssis/template-task">www.cozyroc.com/.../template-task


You can generate arbitrary type of text document without any programming. The generation is based on a template and the template is processed by engine based on the open-source Velocity engine.


The type of input that can be specified are: direct input, file connection manager, variable and data flow destination.