Generate Open XML Presentations using a Presentation Template
One of the most effective ways to build document generation systems is to enable creation of a template document. Generated documents are a product of the template document and some data or code that replaces certain key portions of the template document with content that is specific to the generated document. WordprocessingML documents can contain content controls, which allow us to delineate the content to replace. However, PresentationML does not have content controls. This post presents an approach for building template presentations that you can use to generate custom presentations.
To understand the approach, watch the following screen-cast, which explains how to create a template presentation, and how to replace delineated content with custom content:
The gist of the approach is that instead of using content controls, you use a specific character sequence to delineate the content. In the example that I present here, I use the <# and #> sequences to delineate replacement content. The following sample slide shows what a template slide looks like.
The biggest problem in processing these special character sequences is that paragraphs are often broken into runs, and we can’t control where PowerPoint 2010 places the breaks between runs. This means that if we are searching for the sequence “<# SomeKeyword #>”, we will almost certainly need to search across runs. In the markup for the following slide, you can see that the <# characters are in a run with the word “About”, the keyword “AccountRep” is in a run by itself, and the #> characters are in a run by themselves. The key point is that we don’t know how these runs will be split up.
<a:p>
<a:r>
<a:rPr lang="en-US"
dirty="0"
smtClean="0"/>
<a:t>About &lt;# </a:t>
</a:r>
<a:r>
<a:rPr lang="en-US"
dirty="0"
err="1"
smtClean="0"/>
<a:t>AccountRep</a:t>
</a:r>
<a:r>
<a:rPr lang="en-US"
dirty="0"
smtClean="0"/>
<a:t> #&gt;</a:t>
</a:r>
<a:endParaRPr lang="en-US"
dirty="0"/>
</a:p>
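To see the problem concretely, here is a minimal sketch in Python that uses only the standard library's xml.etree.ElementTree. It is an illustration, not the implementation that accompanies this post. It parses the paragraph above (with the run properties written on one line) and shows that no single run contains the whole placeholder, while the concatenated text of the paragraph does. The names PARAGRAPH, run_texts, and placeholder are mine.

```python
import xml.etree.ElementTree as ET

# DrawingML main namespace, bound to the "a" prefix in slide markup.
A = "http://schemas.openxmlformats.org/drawingml/2006/main"
NS = {"a": A}

# The paragraph from the slide above. "<" and ">" are escaped so the XML parses.
PARAGRAPH = f"""
<a:p xmlns:a="{A}">
  <a:r><a:rPr lang="en-US" dirty="0" smtClean="0"/><a:t>About &lt;# </a:t></a:r>
  <a:r><a:rPr lang="en-US" dirty="0" err="1" smtClean="0"/><a:t>AccountRep</a:t></a:r>
  <a:r><a:rPr lang="en-US" dirty="0" smtClean="0"/><a:t> #&gt;</a:t></a:r>
  <a:endParaRPr lang="en-US" dirty="0"/>
</a:p>
""".strip()

para = ET.fromstring(PARAGRAPH)

# Text of each run, in document order.
run_texts = [t.text or "" for t in para.findall("a:r/a:t", NS)]
placeholder = "<# AccountRep #>"

# Searching run by run misses the placeholder; searching the joined text finds it.
found_in_single_run = any(placeholder in text for text in run_texts)
found_in_paragraph = placeholder in "".join(run_texts)

print(run_texts)
print(found_in_single_run, found_in_paragraph)
```

Joining the text finds the placeholder, but it does not tell us which runs to replace. That is the part the run-splitting approach addresses.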
The solution is to split every run into multiple runs, each containing a single character and carrying the same run properties as the original run. After splitting the runs, the markup will look something like this (shortened for brevity):
<a:p>
<a:r>
<a:rPr
lang="en-US"
dirty="0"
smtClean="0" />
<a:t>r</a:t>
</a:r>
<a:r>
<a:rPr
lang="en-US"
dirty="0"
smtClean="0" />
<a:t> </a:t>
</a:r>
<a:r>
<a:rPr
lang="en-US"
dirty="0"
smtClean="0" />
<a:t>&lt;</a:t>
</a:r>
<a:r>
<a:rPr
lang="en-US"
dirty="0"
smtClean="0" />
<a:t>#</a:t>
</a:r>
<a:r>
<a:rPr
lang="en-US"
dirty="0"
smtClean="0" />
<a:t> </a:t>
</a:r>
<a:r>
<a:rPr
lang="en-US"
dirty="0"
err="1"
smtClean="0" />
<a:t>C</a:t>
</a:r>
<a:r>
<a:rPr
lang="en-US"
dirty="0"
err="1"
smtClean="0" />
<a:t>u</a:t>
</a:r>
<a:endParaRPr
lang="en-US"
dirty="0" />
</a:p>
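The splitting step can be sketched in a few lines of Python with xml.etree.ElementTree. This is a hedged illustration under my own assumptions, not the code behind this post: the function name split_runs is hypothetical, and the sketch assumes each a:r holds at most one a:t and that runs are direct children of a:p. Each new run is a deep copy of the original, so it keeps the original a:rPr.

```python
import copy
import xml.etree.ElementTree as ET

A = "http://schemas.openxmlformats.org/drawingml/2006/main"
R, T = f"{{{A}}}r", f"{{{A}}}t"


def split_runs(paragraph):
    """Replace every multi-character a:r with one a:r per character."""
    for run in paragraph.findall(R):
        text_el = run.find(T)
        text = text_el.text if text_el is not None and text_el.text else ""
        if len(text) <= 1:
            continue  # already a single character (or empty): leave as-is
        position = list(paragraph).index(run)
        paragraph.remove(run)
        for offset, ch in enumerate(text):
            single = copy.deepcopy(run)  # keeps the original a:rPr
            single.find(T).text = ch
            paragraph.insert(position + offset, single)


# A small paragraph whose text matches the shortened listing above.
para = ET.fromstring(
    f'<a:p xmlns:a="{A}">'
    '<a:r><a:rPr lang="en-US" dirty="0" smtClean="0"/><a:t>r &lt;# </a:t></a:r>'
    '<a:r><a:rPr lang="en-US" dirty="0" err="1" smtClean="0"/><a:t>Cu</a:t></a:r>'
    '<a:endParaRPr lang="en-US" dirty="0"/>'
    '</a:p>'
)
split_runs(para)
chars = [r.find(T).text for r in para.findall(R)]
print(chars)
```

Note that findall returns a list, so removing and inserting runs inside the loop does not disturb the iteration.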
After the runs are split so that each run has a single character, it is much easier to look for the pattern that starts with <# and ends with #>. It is then straightforward to replace the sequence of runs that match the pattern with a new run that contains the generated content. Finally, after processing the slide, it is easy to coalesce adjacent runs that share the same formatting.
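The replace and coalesce steps might look like the following Python sketch. Everything here is an assumption made for illustration rather than the implementation that accompanies this post: the helper names (make_run, replace_placeholders, coalesce_runs), the regular expression, the rule that the replacement takes the formatting of the run that held the opening <, and the sample value "Jane Doe". The sketch requires that every a:r already holds exactly one character, so that an index into the paragraph text is also an index into the list of runs.

```python
import re
import xml.etree.ElementTree as ET

A = "http://schemas.openxmlformats.org/drawingml/2006/main"
P, R, RPR, T = (f"{{{A}}}{name}" for name in ("p", "r", "rPr", "t"))
PLACEHOLDER = re.compile(r"<#\s*(\w+)\s*#>")


def make_run(text, **props):
    """Build an a:r with an a:rPr (attributes from props) and an a:t."""
    run = ET.Element(R)
    ET.SubElement(run, RPR, props)
    ET.SubElement(run, T).text = text
    return run


def run_text(run):
    return run.find(T).text or ""


def props_key(run):
    """Serialized a:rPr, used to decide whether two runs share formatting."""
    rpr = run.find(RPR)
    return b"" if rpr is None else ET.tostring(rpr)


def replace_placeholders(paragraph, values):
    """Assumes one character per run, so text offsets equal run indexes."""
    runs = paragraph.findall(R)
    text = "".join(run_text(r) for r in runs)
    # Work right to left so earlier offsets stay valid after each replacement.
    for match in reversed(list(PLACEHOLDER.finditer(text))):
        key = match.group(1)
        if key not in values:
            continue  # unknown keyword: leave the template text untouched
        first, *rest = runs[match.start():match.end()]
        first.find(T).text = values[key]  # reuse the run that held "<"
        for run in rest:
            paragraph.remove(run)


def coalesce_runs(paragraph):
    """Merge each run into the preceding run when their a:rPr match."""
    previous = None
    for child in list(paragraph):
        if child.tag == R and previous is not None and props_key(child) == props_key(previous):
            previous.find(T).text = run_text(previous) + run_text(child)
            paragraph.remove(child)
        else:
            previous = child if child.tag == R else None


# Build "About <# AccountRep #>" as single-character runs.
para = ET.Element(P)
for ch in "About <# ":
    para.append(make_run(ch, lang="en-US", dirty="0"))
for ch in "AccountRep":
    para.append(make_run(ch, lang="en-US", dirty="0", err="1"))
for ch in " #>":
    para.append(make_run(ch, lang="en-US", dirty="0"))

replace_placeholders(para, {"AccountRep": "Jane Doe"})  # hypothetical data
coalesce_runs(para)
result = [run_text(r) for r in para.findall(R)]
print(result)
```

In this example the replacement run inherits the formatting of the text before it, so coalescing folds the whole paragraph back into a single run.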
Example – Download Code