Creating Pivot Tables, Updated

If you tried the PowerTools Core code for pivot tables, it may not have worked. I made a mistake in the code that did not cause a problem in Excel 2007, but was a big problem in Excel 2010. In this article, I am going to explain the error in detail, how I found it and why it works differently in the two versions. Even though this turned out to be a small problem, I think it is a great example of the kinds of problems that can come up when developing with Open XML.

I first got a report that the strings were not appearing in the source worksheet for the pivot table. I looked at the XML that I was generating and found the following:

 

  <sheetData>
   
<row r="1">
     
<c r="A1" t="inlineStr">
       
<is>
         
<t xmlns="">Year</t>
       
</is>
     
</c>
     
<c r="B1" t="inlineStr">
       
<is>
         
<t xmlns="">Quarter</t>
       
</is>
     
</c>

 

At first glance, this might look fine. The cell type “inlineStr” specifies its value using the “is” element and that element contains a “t” element. The problem is the namespace. That bit that says xmlns=”” indicates that the namespace of that element is not the default. The “c” and “is” elements are in the default namespace for the spreadsheet (http://schemas.openxmlformats.org/spreadsheetml/2006/main).

For those not that familiar with namespaces in XML, here is a short explanation. The root element of the XML file may contain multiple namespace definitions. Here is the worksheet root element:

 

<worksheet xmlns="http://schemas.openxmlformats.org/spreadsheetml/2006/main"
 
xmlns:r="http://schemas.openxmlformats.org/officeDocument/2006/relationships">

The “xmlns” attribute defines the default namespace. The “xmlns:r” attribute defines the “r” namespace. If I want to use an element in the “r” namespace, I need to add that prefix. For example, for the “id” name in the “r” namespace, I would see “r:id” in the XML. However, the code does not use that string for elements and attributes. In the PowerTools Core code, there is a whole lot of definitions of XML names (XName objects) and namespaces (XNamespace objects) that are used for attributes and elements. These can be found in the PtOpenXmlUtil.cs” file. Here is a little of the “S” class:

 

    public static class S
   
{
       
public static XNamespace s =
           
"http://schemas.openxmlformats.org/spreadsheetml/2006/main";
       
public static XName c = s + "c";
       
public statiec XName cacheField = s + "cacheField";
       
public static XName cacheFields = s + "cacheFields";

 

The first item defines the namespace that is then used for all of the names defined for that namespace. (Note: I am making additions to these all the time. It would be a big task to go through the standard and put in every name definition. I just add them as I need them.) Then an name in that namespace uses the XNamespace object as part of its definition.

There is one more complexity. Although the element names are all in some namespace, the attributes are usually not in any namespace. For example, in the XML above, the following elements are in the spreadsheet namespace: row, c, is and t. The following attributes are not in any namespace: r and t. At this point, you may have noticed that “t” appears in both lists. That was my error. I used the no-namespace version of “t” for both of those. Here is a little of the NoNamespace class from PtOpenXmlUtil.cs:

    public static class NoNamespace
   
{
       
public static XName a = "a";
       
public static XName accentbar = "accentbar";
       
public static XName adj = "adj";
       
public static XName adjusthandles = "adjusthandles";

Notice that these are different because they have no namespace in the XName definitions. Here is the incorrect code to create a string cell:

newCell = new XElement(S.c, new XAttribute(NoNamespace.r, cellReference),
   
new XAttribute(NoNamespace.t, "inlineStr"),
   
new XElement(S._is,
       
new XElement(NoNamespace.t, value.ToString())));

The new XAttribute for NoNamespace.t is correct, but the new XElement for NoNamespace.t is not. Instead, it should be:

newCell = new XElement(S.c, new XAttribute(NoNamespace.r, cellReference),
   
new XAttribute(NoNamespace.t, "inlineStr"),
   
new XElement(S._is,
       
new XElement(S.t, value.ToString())));

With that change, the XML changes to the following:

  <sheetData>
   
<row r="1">
     
<c r="A1" t="inlineStr">
       
<is>
         
<t>Year</t>
       
</is>
     
</c>
     
<c r="B1" t="inlineStr">
       
<is>
         
<t>Quarter</t>
       
</is>
     
</c>

Why does the original XML work on 2007 and not 2010? This is a bit of a mystery. The original XML should not have been recognized in either version. It has the wrong element for the text. It could be that 2007 uses an XML reader that wasn’t very specific about namespaces. It could be that the code that processes the XML in 2007 is only looking at names and not namespaces. In any case, it does demonstrate that there can be differences in processing on different versions of Excel. If your development needs to support multiple versions of Excel, it is best to test on all of those versions.

A couple of people suggested that changing the type from “inlineStr” to “str” also worked. The “str” type is a bit of mystery to me. The standard says that it specifies a “formula string,” but I can’t quite figure out what that would be. Actual formulas are handled differently. If anyone knows what “str” is used for in Excel, I’d be interested to hear what it is.

I hope you found this detailed explanation interesting and useful. I have updated the PowerTools Core 2.2.2 with this correction, so it should be working fine in Excel 2010 now. Thanks for the feedback that allowed me to make this correction quickly.