Using a WordprocessingML Document as a Template in the Document Generation Process

In this post, I examine the approaches for building a template document for the document generation process.

This post is the second in a series of blog posts.  Here is the complete list: Generating Open XML WordprocessingML Documents Blog Post Series

In my approach to document generation, a template document is a DOCX document that contains content controls that will control the document generation process.  The document template designer can format this document as desired, and the document generation process will generate documents that have the format of the template document.

When working with content controls, first remember to turn on the Developer tab in the ribbon.  Click File => Options => Customize Ribbon, and then turn on the Developer tab:

Turning on the Developer Tab

Another point that will make it easier to work with content controls is to turn on design mode.  If design mode is turned off (which is the default), content controls have a square boxed appearance with a tab at the top that contains the title of the content control:

Content Control - not in Design Mode

This is not a problem, except that when the focus is not in a content control, there is no visual indication that the content control is there.  To make the controls visible at all times, turn on design mode:

Turning on design mode

With design mode turned on, content controls have blue tags that mark where each content control begins and ends.  A template document will then look something like the following:

Sample template document with content controls

In this document, plain text content controls contain a LINQ query that returns a single value.  Formatting is easy – the value returned by the query takes on the formatting of the containing run or paragraph.
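To make the formatting point concrete, here is a minimal Python sketch; it is not the implementation used in this series. It assumes an inline content control whose `w:sdtContent` holds runs directly. The function name `fill_plain_text_control`, the title "Value", the sample query text, and the value "42" are all hypothetical. The sketch keeps the first run, including its `w:rPr` formatting, and swaps in the new text:

```python
import xml.etree.ElementTree as ET

W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
ET.register_namespace("w", W_NS)


def w(tag):
    """Return the fully qualified name of a WordprocessingML element."""
    return f"{{{W_NS}}}{tag}"


def fill_plain_text_control(sdt, value):
    """Replace the text inside a content control, keeping the first run's formatting.

    Assumption: the runs are direct children of w:sdtContent (an inline control).
    """
    content = sdt.find(w("sdtContent"))
    runs = content.findall(w("r")) if content is not None else []
    if not runs:
        raise ValueError("content control has no runs to reuse")
    first = runs[0]
    # Drop any additional runs so only the first (formatted) run remains.
    for extra in runs[1:]:
        content.remove(extra)
    # Remove the old text (the query) but leave w:rPr untouched.
    for old_text in first.findall(w("t")):
        first.remove(old_text)
    new_text = ET.SubElement(first, w("t"))
    new_text.text = value
    return sdt


# Hypothetical markup: a bold plain text control that holds a query.
SAMPLE_SDT = (
    f'<w:sdt xmlns:w="{W_NS}">'
    '<w:sdtPr><w:alias w:val="Value"/><w:text/></w:sdtPr>'
    '<w:sdtContent>'
    '<w:r><w:rPr><w:b/></w:rPr><w:t>customers.Count()</w:t></w:r>'
    '</w:sdtContent>'
    '</w:sdt>'
)

sdt = ET.fromstring(SAMPLE_SDT)
fill_plain_text_control(sdt, "42")
print(ET.tostring(sdt, encoding="unicode"))
```

Because the run properties are never touched, the inserted value appears with whatever formatting the designer applied to the control's text in the template.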

In this document, the rich text content control with Table as its title contains a LINQ query that returns a collection of anonymous types.  The results of the query will be inserted into the document as a WordprocessingML table.  The inserted table will have the formatting of the empty table that the designer inserts into the rich text content control.
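A generation process first has to find the content controls in the template. The following Python sketch shows one way to do that; it is an illustration only, not the code from this series. It assumes the main document part is stored at `word/document.xml` (the conventional location, though a package can place it elsewhere), and the function names, the title "Value", and the sample queries are hypothetical. In the markup, a content control is a `w:sdt` element, and its title is stored in `w:sdtPr/w:alias`:

```python
import zipfile
import xml.etree.ElementTree as ET

W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
NS = {"w": W_NS}


def read_main_part(docx_path):
    """Read the main document part from a DOCX (assumed at word/document.xml)."""
    with zipfile.ZipFile(docx_path) as package:
        return package.read("word/document.xml")


def list_content_controls(document_xml):
    """Return title, text, and table flag for each content control, in document order."""
    root = ET.fromstring(document_xml)
    controls = []
    for sdt in root.iter(f"{{{W_NS}}}sdt"):
        alias = sdt.find("w:sdtPr/w:alias", NS)
        title = alias.get(f"{{{W_NS}}}val") if alias is not None else None
        content = sdt.find("w:sdtContent", NS)
        text, has_table = "", False
        if content is not None:
            # Concatenate all text nodes; in the template this is the query.
            text = "".join(t.text or "" for t in content.iter(f"{{{W_NS}}}t"))
            has_table = content.find(".//w:tbl", NS) is not None
        controls.append({"title": title, "text": text, "has_table": has_table})
    return controls


# Hypothetical template body: one plain text control and one "Table" control.
SAMPLE = f"""<w:document xmlns:w="{W_NS}"><w:body>
  <w:p>
    <w:r><w:t xml:space="preserve">Total customers: </w:t></w:r>
    <w:sdt>
      <w:sdtPr><w:alias w:val="Value"/><w:text/></w:sdtPr>
      <w:sdtContent><w:r><w:t>customers.Count()</w:t></w:r></w:sdtContent>
    </w:sdt>
  </w:p>
  <w:sdt>
    <w:sdtPr><w:alias w:val="Table"/></w:sdtPr>
    <w:sdtContent>
      <w:p><w:r><w:t>from c in customers select new {{ c.Name, c.City }}</w:t></w:r></w:p>
      <w:tbl><w:tr><w:tc><w:p/></w:tc></w:tr></w:tbl>
    </w:sdtContent>
  </w:sdt>
</w:body></w:document>"""

for control in list_content_controls(SAMPLE):
    print(control)
```

For a real template, `list_content_controls(read_main_part("template.docx"))` would return the same structure, where `template.docx` is a placeholder file name.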

Other uses of the word ‘Template’ in Microsoft Office

One minor issue around the idea of creating a template WordprocessingML document is that the term ‘template’ is overloaded.  Microsoft Word has the notion of ‘Document Templates’, which are saved with the dotx extension.  These are WordprocessingML documents with one special characteristic – when the user opens one of these documents, the user cannot directly save back to the dotx file – the user must instead supply a new filename, and Word will append docx as the extension.
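As an aside that goes slightly beyond the behavior described above: inside the package, a dotx and a docx also differ in the content type declared for the main document part in `[Content_Types].xml`. The Python sketch below checks that declaration. The helper names `main_part_kind` and `make_package` are hypothetical, the sketch only looks at `Override` entries, and it ignores the macro-enabled variants:

```python
import io
import zipfile
import xml.etree.ElementTree as ET

CT_NS = "http://schemas.openxmlformats.org/package/2006/content-types"
PREFIX = "application/vnd.openxmlformats-officedocument.wordprocessingml."
DOCX_MAIN = PREFIX + "document.main+xml"
DOTX_MAIN = PREFIX + "template.main+xml"


def main_part_kind(package_bytes):
    """Classify a package as 'template', 'document', or 'unknown'."""
    with zipfile.ZipFile(io.BytesIO(package_bytes)) as package:
        root = ET.fromstring(package.read("[Content_Types].xml"))
    for override in root.iter(f"{{{CT_NS}}}Override"):
        content_type = override.get("ContentType")
        if content_type == DOTX_MAIN:
            return "template"
        if content_type == DOCX_MAIN:
            return "document"
    return "unknown"


def make_package(content_type):
    """Build a tiny in-memory package for demonstration (not a complete DOCX)."""
    types_xml = (
        f'<Types xmlns="{CT_NS}">'
        f'<Override PartName="/word/document.xml" ContentType="{content_type}"/>'
        '</Types>'
    )
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as package:
        package.writestr("[Content_Types].xml", types_xml)
    return buffer.getvalue()


print(main_part_kind(make_package(DOCX_MAIN)))
print(main_part_kind(make_package(DOTX_MAIN)))
```

For the approach in this post the check should report an ordinary document, since the template the designer creates is saved as a regular docx.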

In addition, related to dotx document templates are ‘document template projects’ in Visual Studio 2010 (and 2008).  These are template-based document-level projects (see Architecture of Document-Level Customizations) that consist of managed code that is attached to a document template instead of a document.  The user opens the template, uses the managed customization to do whatever it does, and then saves as a docx document.  The docx document can have a managed customization, or it can be stripped of the customization, leaving a plain old docx.

For this document generation project, we don’t need to use either of these facilities.  Instead, the template document that the designer creates is, as far as Word is concerned, an ordinary word-processing document.

4 Comments »

  1. Svetlin said,

    January 26, 2011 @ 10:09 pm

    Thanks for the post Eric!

    I took a very similar approach back in mid 2008 when I first got involved in this project. But, due to requirement constraints, I had to do some extra work.

    I had to provide an interface (as part of a bigger system) for an end user to design the template in a drag-and-drop fashion (similar to SQL Reporting). So I ended up hosting MS Word 2007 in a custom control inside the bigger app. That control also has a section that provides a list of all available fields from the system, as well as a “map” of the fields that are already in the document for editing their properties.

    I didn’t use docx as such, but the flattened xml form of it. I feel more comfortable working with XML, and at the time the OpenXML API v2 was still CTP, so I had to build one myself, not as comprehensive of course, but enough to help me with the processing.

    My initial design was also set around content controls, but I converted to CustomXml elements (I didn’t know about the lawsuit at the time) for the following reasons:
    1st – I was storing an XPath string in the tag property, which is limited to 64 characters in length.
    2nd – CustomXml allowed me to extend the metadata by just adding an attr element with name and value attributes, without “polluting” the template presentation and scaring the user off.
    3rd – As an unintended consequence, the users much preferred the purple color of the CustomXml elements.

    I have 4 types of “placeholders”:
    1. FIELD – single value.
    2. REPEATER – for processing a list of objects. The XPath maps to multiple XML elements of the same type. When processing, everything inside it is repeated once per element. The content can be anything: plain text, a FIELD, another (nested) REPEATER, a CONDITIONAL field, or a TABLE. In the case of a table I’m doing something similar to what you are doing. The difference is that in my case the user can decide which fields he wants outputted.
    3. CONDITIONAL – a placeholder that at processing time evaluates an expression tree; if the result is true, everything inside it is processed. As with REPEATER, any content can go inside.
    4. PROMPT – this is a manual input single value field for some documents – typically ad-hoc letters that are not going to be batch server generated.

    Well, that is for now. It’s past midnight on this side of the world, so let me go and have some sleep.

    I would appreciate your comments.

    Thank you !

  2. Eric White said,

    January 26, 2011 @ 11:32 pm

    Hi Svetlin,

    That is a great write-up with interesting ideas!

    The idea of an expression tree that contains a predicate is interesting. I think this could be handled also using a content control that contains an expression as well as other content. This might be a great use of nested content controls – within a conditional content control, there is a content control that contains the predicate, and another content control that contains the conditional content.

    I had considered the idea of a PROMPT content control. One place that this would be interesting is in SharePoint, where after assessing the document for PROMPT content controls, you create an InfoPath form and request responses to prompts. Alternatively, you create a Web part that requests responses to prompts. One more advantage of the PROMPT idea is that the user can specify content that is inserted at multiple places in the document. An example is a legal contract where a PROMPT content control is repeated at multiple places in the template document, yet is asked only once of the user. The user’s response is inserted at all appropriate places in the document.

    I have some additional ideas around the content controls that create tables. I’ll take another stab at a more elaborate template document that includes more capabilities – prompts, conditional, and a new approach to creating tables.

    -Eric

  3. Document Generator – Redefine your productivity!Document Generation Software said,

    March 1, 2013 @ 7:58 pm

    [...] achieve scalable performance without the dependency on other forcing environments. Thus, intuitive documentation becomes possible for every user effortlessly. Share this: This entry was posted in Document Generator by DocGenDoc. Bookmark the [...]

  4. Leonel said,

    October 23, 2014 @ 9:03 pm

    Thanks for finally talking about >Eric White’s Blog
