Text Templates (T4) and the Code Generation Process
As I was contemplating the process of generating the C# code that will do the document generation, I was drawn to the idea of using text templates, also known as T4. Text templates are a .NET code generation technology. I have never used text templates before, so I spent a few hours researching them to see their applicability in the Open XML WordprocessingML document generation process. The short version of this post is that I have decided against using text templates in this particular iteration of a document generation system. However, text templates are very cool, and have applicability in the Open XML document generation process. This post details my notes and thoughts on text templates, and gives my reasons for deciding against using them, although I am going to steal some ideas from them.
This post is the eighth in a series of blog posts on generating Open XML documents. Here is the complete list: Generating Open XML WordprocessingML Documents Blog Post Series
Text templates are a very cool technology (introduced in VS2008, I believe) that makes it easy to generate code as part of the application build process in Visual Studio. In addition, you can use text templates to generate files at runtime. You code text templates in a way that is similar to coding ASP.NET pages. Some portions of a text template contain literal text that is copied verbatim to the generated file, while other portions (similar to code blocks and expression holes) contain C# or VB code that you can use to programmatically generate portions of the generated file. For example, the following text template generates a simple XML document:
<#@ template debug="false" hostspecific="true" language="C#" #>
<#@ output extension=".xml" #>
<#@ assembly name="System.Xml" #>
<#@ assembly name="System.Xml.Linq" #>
<#@ import namespace="System.Xml.Linq" #>
<#
XElement e = new XElement("ChildElement", "with some data");
#>
<Root>
<#= e #>
</Root>
Here is the generated XML document:
<Root>
<ChildElement>with some data</ChildElement>
</Root>
There is a fair amount to learn about text templates. The lines that start with <#@ are directives. The assembly directives tell the text template to reference the specified assemblies. The import directive serves the same purpose as the using directive in a C# program. The code between <# and #> is a control block, which is executed when the template is evaluated. The line <#= e #> is an expression block: the expression is evaluated, converted to a string, and written to the output. It serves the same purpose as an expression hole in other similar technologies.
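To make the difference between the block types concrete, here is a small sketch that is not part of the doc gen project. It uses a control block to drive a loop and an expression block to emit the loop variable. The Item element and its id attribute are made up for illustration.

```t4
<#@ template language="C#" #>
<#@ output extension=".xml" #>
<Root>
<#
    // Control block: plain C# that decides how often the literal text below is written.
    for (int i = 1; i <= 3; i++)
    {
#>
  <Item id="<#= i #>" />
<#
    } // end of the loop opened in the control block above
#>
</Root>
```

The literal Item line sits inside the loop body, so it is written once per iteration, with the id values 1 through 3. The exact whitespace around the block lines may vary, so I would not rely on it.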
The principal reason that I am not going to use text templates in this current effort is that I am writing a pure functional transform from a WordprocessingML document to a bunch of pure functional C# code that will generate a number of WordprocessingML documents. I would not be using the most powerful feature of text templates, which is expression holes, at least not with the intent with which they were designed. The text template then becomes simply a mechanism to combine some boilerplate code with some code generated by the functional transform. Pulling in the additional complexity of text templates doesn’t pay.
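As a rough illustration of why the template adds little here, combining boilerplate with generated code needs nothing more than string composition. This is a minimal sketch, not the code from this series: GeneratedProgramBuilder, Header, Footer, and Build are hypothetical names, and the boilerplate shown is a placeholder.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class GeneratedProgramBuilder
{
    // Hypothetical boilerplate: the fixed opening of every generated program.
    const string Header =
        "using System;\n" +
        "static class Generated\n" +
        "{\n" +
        "    static void Main()\n" +
        "    {\n";

    // Hypothetical boilerplate: the fixed closing of every generated program.
    const string Footer =
        "    }\n" +
        "}\n";

    // Pure function: generated statements in, complete source text out.
    public static string Build(IEnumerable<string> statements)
    {
        // Indent each generated statement and put it on its own line.
        string body = string.Concat(statements.Select(s => "        " + s + "\n"));
        return Header + body + Footer;
    }
}
```

Calling GeneratedProgramBuilder.Build with the statements produced by the transform returns the full source text as a string, with no template engine involved.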
That said, text templates are interesting in the domain of Open XML document generation. Instead of approaching the problem as I am in this current series of posts, the approach using text templates would be to generate a Flat OPC document. You can use LINQ to XML handily in text templates. The development of a doc gen system using text templates becomes one of finding the interesting markup in the Flat OPC document and writing some expression holes to generate the variable parts of the document. This is definitely going on my list of blog posts to write in the near future. It will be an easy post to write; perhaps only an hour or two will be required to build a rudimentary doc gen system (one with quite different characteristics from the one I’m currently building).
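Here is a rough sketch of what such a template might look like. It is an untested illustration rather than a finished system: the customer name and the lines of text are hypothetical hard-coded data, and a real Flat OPC document saved by Word contains many more parts (styles, settings, and so on) than the two shown. SecurityElement.Escape is used so that characters such as & in the data do not produce malformed XML.

```t4
<#@ template language="C#" #>
<#@ output extension=".xml" #>
<#@ import namespace="System.Security" #>
<#
    // Hypothetical data; a real system would pull these values from a data source.
    string customerName = "Contoso & Sons";
    string[] lines = { "Thank you for your order.", "Your invoice is attached." };
#>
<?mso-application progid="Word.Document"?>
<pkg:package xmlns:pkg="http://schemas.microsoft.com/office/2006/xmlPackage">
  <pkg:part pkg:name="/_rels/.rels" pkg:contentType="application/vnd.openxmlformats-package.relationships+xml">
    <pkg:xmlData>
      <Relationships xmlns="http://schemas.openxmlformats.org/package/2006/relationships">
        <Relationship Id="rId1" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/officeDocument" Target="word/document.xml"/>
      </Relationships>
    </pkg:xmlData>
  </pkg:part>
  <pkg:part pkg:name="/word/document.xml" pkg:contentType="application/vnd.openxmlformats-officedocument.wordprocessingml.document.main+xml">
    <pkg:xmlData>
      <w:document xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main">
        <w:body>
          <w:p><w:r><w:t>Dear <#= SecurityElement.Escape(customerName) #>,</w:t></w:r></w:p>
<#
    // One paragraph per line of variable text.
    foreach (string line in lines)
    {
#>
          <w:p><w:r><w:t><#= SecurityElement.Escape(line) #></w:t></w:r></w:p>
<#
    } // end of the foreach loop
#>
        </w:body>
      </w:document>
    </pkg:xmlData>
  </pkg:part>
</pkg:package>
```

The fixed markup is copied through verbatim, and only the expression holes and the loop vary from one generated document to the next.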
There are a few more interesting points to note about text templates. The most powerful and common use of text templates is to facilitate code generation from within a Visual Studio project. The text template is evaluated whenever it is saved, and the resulting generated C# or VB source file is compiled whenever you build the project. This allows you to generate code as part of the editing process, and then use the generated code from other modules seamlessly. This is super-interesting, but not really relevant to the problem of document generation. Document generation should ultimately be in the hands of the domain experts – the marketing folks, the customer relationship departments, and whoever has industrial-strength document generation requirements. We don’t want to require Visual Studio for the design process.
You can generate text files at run-time by using a pre-processed text template. There are limitations on what you can do with this approach. Effectively, you can define additional properties for the generated class behind the template. You can then use those properties in a non-dynamic way in the generated file. This is somewhat interesting, but doesn’t justify the additional complexity.
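The following sketch shows the shape of that arrangement. The first half of the partial class is a hand-written stand-in for the class that Visual Studio generates from a pre-processed template; the real generated code is longer, but it does expose a TransformText() method. The names LetterTemplate and CustomerName are hypothetical.

```csharp
using System;
using System.Text;

// Stand-in for the class generated from a pre-processed template
// (hypothetically LetterTemplate.tt). Written by hand for illustration only.
public partial class LetterTemplate
{
    public string TransformText()
    {
        // Literal text and expression blocks become successive Append calls.
        var output = new StringBuilder();
        output.Append("Dear ");
        output.Append(CustomerName);
        output.Append(",\n");
        return output.ToString();
    }
}

// Your half of the partial class: a property that the template body can read.
public partial class LetterTemplate
{
    public string CustomerName { get; set; }
}
```

At run-time you would set the property and call the method, for example new LetterTemplate { CustomerName = "Contoso" }.TransformText(). The template text itself is fixed at compile time; only the property values change.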
One feature of text templates is that you can write a ‘Custom Host’ that allows you to kick off the transformation process programmatically. There is an interesting note in the topic Processing Text Templates by using a Custom Host:
We do not recommend using text template transformations in server applications. We do not recommend using text template transformations except in a single thread. This is because the text templating Engine re-uses a single AppDomain to translate, compile, and execute templates. The translated code is not designed to be thread-safe. The Engine is designed to process files serially, as they are in a Visual Studio project at design time.
It is probably possible to design a robust document generation system using text templates, but you would have to take care to avoid any complications related to the above warning. You would also want to do a lot of testing at scale.
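One precaution you might take is to make sure that only one transformation runs at a time. This is a minimal sketch under that assumption: SerializedTemplateRunner and Run are hypothetical names, and the transform delegate is a placeholder for whatever call actually kicks off the template engine. Serializing the calls does not by itself make the engine a supported choice for server applications.

```csharp
using System;

static class SerializedTemplateRunner
{
    // One process-wide lock, so that at most one transformation runs at a time.
    private static readonly object Gate = new object();

    // 'transform' is a placeholder for the call that actually runs the template.
    public static string Run(Func<string> transform)
    {
        if (transform == null) throw new ArgumentNullException("transform");
        lock (Gate)
        {
            // Other callers block here until the current transformation finishes.
            return transform();
        }
    }
}
```

Callers on any thread would go through SerializedTemplateRunner.Run, which trades throughput for never having two transformations in flight at once.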
For more info about text templates, see Code Generation and Text Templates (T4).
shiva.k.k said,
August 8, 2014 @ 8:28 am
Hi Eric,
I have a generic question on the manipulation of Word documents.
I am not sure if this is the right place to ask, but I didn’t find any other link.
I am looking for a way of manipulating a Word document, say by identifying a paragraph, section, or table using an extended attribute such as an id, so that I can get the entire Word feature along with its formatting and associations.
Is this possible?