Release of V2 of Doc Gen System: XPath in Content Controls

Today I’m posting the release of version 2 of my simple document generation system.  In this example, you configure the document generation process by creating a template document that contains content controls.  You then enter XPath expressions in those content controls.  Those XPath expressions specify the data that the document generator pulls from the source data.  The source data is an XML document that contains the data for each document that you generate.  The source XML document can also contain detail (child) records that populate tables in the generated document.  I detailed how the template document works in the post Generating Open XML WordprocessingML Documents using XPath Expressions in Content Controls.
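
To make the idea concrete, here is a minimal sketch of the lookup step only: evaluating an XPath expression (as it might be typed into a content control) against one record of the source XML, and evaluating a row-selecting expression for child records. This is not the code from the download. The class name, method names, element names, and sample data are all hypothetical and chosen purely for illustration, and the sketch does not read or write a WordprocessingML document.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;
using System.Xml.XPath;

public static class XPathLookupSketch
{
    // Hypothetical source data: one <Customer> element per generated document,
    // with <Order> child records that would populate a table.
    public const string SampleXml = @"
<Customers>
  <Customer>
    <Name>Andrew</Name>
    <Orders>
      <Order><ProductDescription>Bike</ProductDescription><Quantity>2</Quantity></Order>
      <Order><ProductDescription>Helmet</ProductDescription><Quantity>1</Quantity></Order>
    </Orders>
  </Customer>
</Customers>";

    // Evaluates an XPath expression relative to one record and returns the text
    // of the first matching element, or an empty string when nothing matches.
    public static string SelectValue(XElement record, string xpath)
    {
        XElement match = record.XPathSelectElement(xpath);
        return match == null ? string.Empty : match.Value;
    }

    // Evaluates a row-selecting XPath expression, then evaluates one cell
    // expression per column relative to each selected row.
    public static List<string[]> SelectRows(XElement record, string rowXPath, params string[] cellXPaths)
    {
        return record.XPathSelectElements(rowXPath)
            .Select(row => cellXPaths.Select(cell => SelectValue(row, cell)).ToArray())
            .ToList();
    }

    public static void Main()
    {
        XElement customer = XDocument.Parse(SampleXml).Root.Element("Customer");

        // A single value, as a simple content control might request it.
        Console.WriteLine(SelectValue(customer, "./Name"));

        // Child records, as a table in the template might request them.
        foreach (string[] cells in SelectRows(customer, "./Orders/Order", "./ProductDescription", "./Quantity"))
            Console.WriteLine(string.Join(" | ", cells));
    }
}
```

The sketch uses the XPathSelectElement and XPathSelectElements extension methods from System.Xml.XPath, which work directly on LINQ to XML elements. Evaluating the cell expressions relative to each row, rather than relative to the whole document, is what lets one row expression drive a table with any number of child records.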

This post is the 14th in a series of blog posts on generating Open XML documents. Here is the complete list: Generating Open XML WordprocessingML Documents Blog Post Series

Download: Generate Open XML WordprocessingML Documents using XPath Expressions in Content Controls

In my opinion, using XPath expressions in content controls is a superior approach to entering C# code in content controls.  The code is cleaner and smaller (this first example is fewer than 240 lines of code).

I’ve recorded a short (2 minute) screen-cast that demonstrates this example in action.

Demonstrates the XPath-in-Content-Controls approach to document generation

So please download the example, try it out, and give me feedback.


38 Comments »

  1. Robert Nattenberg said,

    March 30, 2011 @ 5:56 am

    Hi Eric,

    If I’m not mistaken, you’re the Eric White I worked with about 20+ years ago when I was at Unisys when I was working on XVT.

    If I am mistaken, pardon me. If I’m not, I hope all is well with you.

    Best,
    Bob

  2. Alternate OOXML Document Generation Approach « Coding The Document said,

    March 31, 2011 @ 7:56 pm

    [...] White has put out a document generation example which uses XPath and Word Content Controls.  I applaud Eric for the amount of work he has done [...]

  3. Mike Brennan said,

    April 1, 2011 @ 8:27 pm

    Sometimes forms do not fall into computationally convenient patterns (Low Volume High Complexity (LVHC) #8).
    Let me suggest a significant and potentially valuable enhancement to your example from the LVHC viewpoint.
    - Consider adding a column to the table that is a single character such as a taxable flag, promotional flag, discounted flag, etc.
    - Set up the table such that the output will always be narrow, only one or two characters wide.
    - Determine how to support the column with an expression that is much wider than the column output format will permit.

    When solved with a mapping approach, such as using something in place of the xpath expression that maps to an xpath expression, a great deal of generality is added to the solution. This kind of approach can also lead to separation of the ‘user view’ of the template fields from the ‘technical view’ of the template fields.

    I like the elegance of your solution and hope you can find a way to include this enhancement.
    - Mike

  4. Eric White said,

    April 1, 2011 @ 10:30 pm

    Great idea! This could be done with something like *1 or ~1 that maps to XPath expressions in the Config content control. I’ll have to do a bit of research to make sure that the starting character is not allowed in XPath expressions.

    -Eric

  5. Eric White said,

    April 1, 2011 @ 10:35 pm

    Hi Bob, yes, you are right! Same Eric White. Good to hear from you.

    -Eric

  6. Srinath said,

    April 4, 2011 @ 3:03 pm

    I didn’t find any licensing information in the downloads. I find the code highly instructional, and in case I want to use it as a starting point for my project, I just wanted to be sure it’s okay.

  7. Eric White said,

    April 7, 2011 @ 2:00 pm

    Hi Srinath,

    Yes, go for it. At some point in the future I will be releasing licensed code but this is a simple 240 line example. Please use it.

    -Eric

  8. Jayadev Thimmaraju said,

    April 20, 2011 @ 1:58 pm

    hi Eric,

    I have been asked to develop a custom proposal management solution (output is Word, PDF, Excel) using the Microsoft technology platform. I came across your site. Could you please advise whether I can go ahead with the Open XML approach and use your code? Will Dell have any licensing issues if I develop a prototype and then need to move this code to production?

    Also, are there any third-party tools that generate Word documents on servers?

  9. Eric White said,

    April 22, 2011 @ 6:16 pm

    Hi Jayadev,

    I will release information on licensing in the next 2-3 days. I intend to release this code using the Microsoft Reciprocal License (Ms-RL). Will this license serve your purposes?

    -Eric

  10. Eric White said,

    April 23, 2011 @ 4:51 am

    Also, there are a plethora of third-party server-side tools for document generation, ranging from Office automation solutions (bad) to a wide variety of Open XML solutions (good). This includes, of course, the Open XML SDK productivity tool and document reflector.

    -Eric

  11. Jayadev Thimmaraju said,

    April 25, 2011 @ 8:58 am

    Thank you Eric for your response. Appreciate it. I will wait for your updates on licensing.

  12. Jayadev Thimmaraju said,

    April 26, 2011 @ 4:19 am

    Thanks Eric. I will wait for your updates on License release. Appreciate your help.

  13. Tom said,

    June 21, 2011 @ 8:15 pm

    Hi Eric,
    I’ve been looking to implement something similar to this for some time.
    Unfortunately I’m not experienced in C# and I’m having a hard time converting it to VB. Are you able to make a release in VB as well?

    Thanks,

    Tom

  14. Eric White said,

    June 22, 2011 @ 12:19 am

    Hi Tom, that code can be converted to VB – it will require some fairly deep knowledge of LINQ to do so. Have you gone through my functional programming tutorial for VB? http://blogs.msdn.com/ericwhite/pages/fp-tutorial-vb.aspx

    Converting that code to VB is a great idea, but at the moment, I am buried in other projects.

    One thing you may consider is to just use it in C#. It is easy enough to mix C# and VB in the same project.

    -Eric

  15. Tom said,

    June 22, 2011 @ 2:34 am

    Hi Eric,
    I appreciate the reply.
    I’m a newbie to LINQ which is where I was having problems with the conversion. Everything else was converting across fine except for the LINQ queries.
    I went through a couple of different converters and managed to find one which did it perfectly. I’ll use the converted code to teach myself LINQ as I build on the solution.

    Thanks for the starting point!

  16. Kevin said,

    July 18, 2011 @ 6:34 pm

    I have a project that I am using this technique for substituting values from my business objects into a document (I will call these merged documents). My application uses DocumentBuilder 2.0 to assemble my final document based on decisions in the application using several Word2010 documents. My problem is that DocumentGenerator writes a document to the physical disk. Is it possible to create resulting documents from DocumentGenerator as WmlDocument without having to write a document to the physical disk?

  17. Eric White said,

    July 18, 2011 @ 7:21 pm

    Hi Kevin,

    I haven’t put together a version that generates the documents in-memory. It would not be hard to do, but I hadn’t considered a case where this would be useful. DocumentGenerator could return a list of WmlDocument objects. I’ll put this on my list of things to do…

    -Eric

  18. chandramouli said,

    September 2, 2011 @ 1:41 pm

    Dear Mr. Jayadev:
    I have been trying to get your contact numbers for quite some time. We also spent some time at your house in Chennai.
    Kindly mail to the following email id
    chandravmouli@rediffmail.com
    moulivsastry@yahoo.co.in

  19. Kristen said,

    October 25, 2011 @ 5:57 pm

    Any chance you could send me the zip file (http://cid-5e385848af211ba9.office.live.com/self.aspx/11-03-08-Doc-Gen/11-03-24-Gen-Docs-XPath.zip)? For some reason, I am not able to download it.

    Also, I’m wondering if you can tell me if the Express version of Visual Studio will suffice?

    Thanks so much,
    Kristen

  20. Kevin said,

    January 11, 2012 @ 8:51 pm

    My application uses DocumentBuilder 2.0 and OpenXMLPowerTools to assemble my final document based on decisions in the application using several Word2010 documents.

    doc1.docx has a numbered list (with letters A. through G. or numbers 1. through 8.), and doc2.docx is a continuation of the same list (so its first numbered paragraph should be H or 9).

    When I add the second document, I use the Source(WmlDocument source, bool keepSections) constructor and keepSections is always false.

    Is it possible to do a continuous list? Thanks for your input.

  21. Stuart Rivenbark said,

    January 23, 2012 @ 3:10 pm

    I am having trouble downloading the sample. The link http://cid-5e385848af211ba9.office.live.com/self.aspx/11-03-08-Doc-Gen/11-03-24-Gen-Docs-XPath.zip does not work.

    Could you please send the ZIP to the following two addresses? My work network often blocks ZIP file attachments.

    Please send to stuart.rivenbark.ctr@navy.mil and newbernrivenbarks@suddenlink.net.

    Thanks so much.

  22. Stu said,

    January 23, 2012 @ 6:12 pm

    Any chance you could send me the zip file (http://cid-5e385848af211ba9.office.live.com/self.aspx/11-03-08-Doc-Gen/11-03-24-Gen-Docs-XPath.zip)? For some reason, I am not able to download it.

    Please send to stuart.rivenbark.ctr@navy.mil and
    newbernrivenbarks@suddenlink.net

  23. Morrowyn said,

    March 7, 2012 @ 4:15 pm

    Hi,

    I tried your code, which works really well. However, when I add some content controls, the content of some of them is wrapped in a paragraph (w:p). The code crashes horribly on this:

    if (tag == "SelectValue")
    {
        XElement run = element.Element(w + "sdtContent").Element(w + "r");
        string valueSelector = GetContentControlContents(element);
        string newValue = document.XPathSelectElement(valueSelector).Value;
        …..

    I added a check for run == null here and do the following to retrieve the text XElement:

    XElement parent = element.Element(w + "sdtContent");
    run = parent.Element(w + "p");
    XElement cp = parent.Element(w + "p");
    However, I’m unable to replace the text of the paragraph. Does this involve the trick of removing and re-adding the paragraph?

    Regards and keep up the good work.

  24. Morrowyn said,

    March 8, 2012 @ 10:20 am

    Never mind, the following code does it for me:

    XElement parent = element.Element(w + "sdtContent");
    XElement paragraph = parent.Element(w + "p");
    if (paragraph != null)
    {
        // The content control wraps a paragraph: remove it and
        // add a new paragraph with the new value.
        paragraph.Remove();
        XElement newParagraph = new XElement(w + "p", new XElement(w + "r", new XElement(w + "t", newValue)));
        parent.Add(newParagraph);
        return parent;
    }
    else
    {
        // The content control wraps a run: keep everything except the old text.
        // (run is the w:r element selected earlier, so it is non-null on this path.)
        XElement t = new XElement(w + "r", run.Elements().Where(e => e.Name != w + "t"), new XElement(w + "t", newValue));
        return t;
    }

  25. Jan Vorster said,

    March 19, 2012 @ 1:17 pm

    Hi Eric,

    I’m trying to accomplish adding a numbered list inside the Word template document… I’ve managed to add a single level of numbering, but how would I go about achieving nested numbering (ie 1.1.1 ).. would I use nested repeaters? Is this even possible?

    Thank you
    Jan

  26. Chris said,

    October 3, 2012 @ 10:58 am

    Any chance you could share the application code that you’ve written for these article posts?

    The download link doesn’t seem to be valid any longer.

    Thanks,

    Chris

  27. Eric White said,

    October 4, 2012 @ 1:14 pm

    Hi Chris,

    Sorry about that – I was cleaning up my skydrive and was a bit too enthusiastic. I’ve updated the link in the post. Just to be sure that it is clear, here is the link:

    https://skydrive.live.com/redir?resid=5E385848AF211BA9!5731&authkey=!AOs_0PT_ICgWjcg

    -Eric

  28. Richard said,

    December 26, 2012 @ 4:12 pm

    Hello Eric,

    I’ve been looking at your post and would like to know how to use your recursive approach for 3 level of XML Nodes.
    I’ve created a Word document like yours and have embedded 3 levels of content controls (Repeat1, Repeat2, Repeat3, which are meant to match the 3 sub-levels of my XML node).
    How can I perform such a recursion?
    Thanks.
    Richard

  29. Eric White said,

    December 26, 2012 @ 6:24 pm

    Hi Richard,

    I believe it just works. I haven’t tried it, but as far as I know, it is coded so that it doesn’t care if you nest 3 deep.

    -Eric

  30. Javier Montoro said,

    January 8, 2013 @ 4:18 pm

    Hi Eric,

    I’m having lots of problems with sdt controls… First of all, if you add a (rich) content control (c.c. from now on) using Word 2010, it doesn’t allow you to nest another c.c. inside. I’ve discovered that it’s because of the w:placeHolder tag. If I change the XML itself, it becomes possible to nest c.c. So, first question:
    - Is it possible to nest c.c. using word alone, without touching the xml?
    Now, if we remove the content of this c.c. edited by xml, click outside and inside again, ops, it is impossible to nest c.c. again! Word adds the placeHolder tag itself again… So here comes my second question:
    - Is it possible to block the tag placeHolder so it doesn’t appear anymore? Or disable it totally?
    So I started researching about grouping c.c, and I’ve seen that the group c.c. is always “Group”, and using Word I can’t change it. Yet, if I add a tag in the xml with another value, the group c.c. is no “Group” anymore, but the name I gave it in the xml, so ok. But, again, third question:
    - Can I “default” the name of the group c.c. tag so everytime I group c.c. the tag is my own?
    And one last thing. Playing with the group c.c., I see that it is possible to insert new c.c. inside… but if I destroy everything inside, click outside and inside again, ops, the same thing happens, now it is not possible to add new c.c., and the group c.c. is useless.

    Sorry for so big text with so many questions!
    Thanks in advance. Regards

    Javier Montoro
    Software Engineer at Agresso

  31. Jonathan Moran said,

    February 22, 2013 @ 4:29 pm

    Eric
    We had an in-house developer create a docgen platform for us 10 years ago that we are still using today. I would like to have it updated. Can you recommend a software engineer or two? It is Word-based and we have about 20 documents that we use. It creates each document individually, which is time-consuming. We usually need about 5-8 documents at a shot. We would like to create 3-6 packages that would create all the documents at once.

    Thanks

  32. Dot NET document generator – modular business intelligence deployment.NET Document Generation said,

    February 22, 2013 @ 10:49 pm

    [...] create the first impression, they need to be perfect. With the .Net integration, it becomes possible to deploy a completely controlled system to achieve scalable models of documentation. These models are associated with the .Net engine and [...]

  33. Eric White said,

    February 23, 2013 @ 8:28 am

    Hi Jonathan

    I’d be happy to talk with you about your doc gen system.

    If your project will require more than one developer, I work closely with a friend of mine, Brant Stoede, who is a super developer and is competent in Open XML.

    You can send email to me at eric at ericwhite.com.

    -Eric

  34. Guido Gilli said,

    March 6, 2013 @ 1:50 pm

    Thanks for your post. The solution works fine in the body section, but it doesn’t work in the header and footer sections. Can you tell me how to resolve this problem?
    Thank you

  35. Kevin said,

    March 13, 2013 @ 7:47 pm

    I have been using this technique (Creating a template document that contains content controls. Then enter XPath expressions in those content controls. Those XPath expressions specify the data that the document generator pulls from the source data.)

    This works for content in the body of the document, can you tell me how to do the same for the footer?

    Thanks.

  36. Kevin said,

    March 13, 2013 @ 7:51 pm

    I too have tried to use this technique in the footer of my document. It does not work. Please let me know if you find a solution or get a reply back from Eric White. Thanks.

  37. Mark Buckley said,

    August 30, 2013 @ 3:45 pm

    It appears to me that the ‘Table’ content control can only have ONE row. Is this correct? In my XML data source file I have about 10 ‘child’ fields (products that each Company produces) that I need to display in a table format. The field data is too large to display all in one row. If I put 3 or 4 fields in one row, my document is created fine. If I then go back and move one of the fields to another row, I get an error saying ‘Expression must evaluate to a node-set’. If I move the field back to its original location, the document is created correctly.

    I guess what I’m trying to find out is if what I’m seeing is correct or am I doing something wrong.

    Thank you for any guidance you can give me. -MARK-

  38. Mark Buckley said,

    August 30, 2013 @ 7:47 pm

    So I have had a ‘DUH’ moment and figured out my own answer. The answer is to use the ‘Repeater’ control, which makes perfect sense if you consider the normal .NET data controls. The GridView or table has a fixed one-row-at-a-time layout, while the Repeater is much more flexible and allows for a free-form layout, which is what I needed. Like I said, ‘DUH’!
    -MARK-
