Screen-cast: Using Open XML Spreadsheets as a Data Source

One of the interesting scenarios around Open XML is that of extracting data from a spreadsheet.  The ability to programmatically retrieve the data from a spreadsheet means that we can use Excel as a powerful tool for maintaining data, and then drive interesting processes from that data.

Excel automation is an approach where you use VBA to access the Excel interop programming interface.  For an introduction to using VBA to automate Excel, see Getting Started with VBA in Excel 2010.   It certainly is easy enough to use Excel automation to retrieve spreadsheet data, but this brings with it several issues:

Instead, another approach is to directly access the data using the Open XML SDK.  Some time ago, I wrote some code that makes it easy to query Excel spreadsheets using Linq.  Using that code, you can write concise code to retrieve all of the data in various sheets or tables in a workbook.

A real-world problem for me recently was how to generate some fairly involved HTML that would present a good user interface to Open XML developers who are searching for content that will help them solve certain Open XML developer problems.  I was putting together the Open XML developer content wiki, and I wanted to automate the building of pages that would present Open XML content by keyword, and Open XML content by author.  I used the LtxOpenXml module to query a spreadsheet that contains the list of content, and generate the HTML for those pages.  The following screen-cast explains this scenario in detail, explains the code, and demos the code.  The code and the spreadsheet are attached to this blog post.

Download: ContentListToHtml.zip