Search and Replace Text in an Open XML WordprocessingML Document

Searching and replacing text in an Open XML WordprocessingML document is not especially difficult, but a few issues complicate the process. The text that you are searching for may span multiple runs, so the search algorithm needs to take this into account. The replacement text should take the character formatting of the first character of the matched string in the document. I’ve written a blog post at OpenXMLDeveloper.org that presents the algorithm, along with example code written using XmlDocument (Microsoft’s implementation of the XML DOM). In addition, I’ve recorded a screencast that explains the algorithm:

Walks through an algorithm to search and replace text in an Open XML WordprocessingML document.
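
The two complications described above (matches that span runs, and replacement text that inherits the formatting of the first matched character) can be illustrated with a minimal sketch. This is not the sample code from the OpenXMLDeveloper.org post, which uses XmlDocument; it is a Python illustration using only the standard library, and the function names (`make_run`, `split_into_char_runs`, `replace_in_paragraph`) are hypothetical. It assumes one workable approach: break each text-only run into single-character runs, look for a sequence of runs that spells the search string, and swap that sequence for one new run that copies the first run's `w:rPr`. It handles a single paragraph and does not merge the single-character runs back together afterward.

```python
import copy
import xml.etree.ElementTree as ET

W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
XML_SPACE = "{http://www.w3.org/XML/1998/namespace}space"
ET.register_namespace("w", W)


def w(tag):
    """Return the namespace-qualified name for a w: element."""
    return f"{{{W}}}{tag}"


def make_run(text, rpr):
    """Build a w:r holding `text`, with a copy of the formatting `rpr` (or None)."""
    run = ET.Element(w("r"))
    if rpr is not None:
        run.append(copy.deepcopy(rpr))
    t = ET.SubElement(run, w("t"))
    t.text = text
    t.set(XML_SPACE, "preserve")  # keep leading and trailing spaces
    return run


def split_into_char_runs(paragraph):
    """Replace each text-only run with one run per character."""
    for run in list(paragraph.findall(w("r"))):
        children = [c for c in run if c.tag != w("rPr")]
        if len(children) != 1 or children[0].tag != w("t"):
            continue  # leave runs holding tabs, breaks, drawings, etc. untouched
        text = children[0].text or ""
        rpr = run.find(w("rPr"))
        index = list(paragraph).index(run)
        paragraph.remove(run)
        for offset, ch in enumerate(text):
            paragraph.insert(index + offset, make_run(ch, rpr))


def char_of(element):
    """Return the character held by a single-character run, else None."""
    if element.tag != w("r"):
        return None
    children = [c for c in element if c.tag != w("rPr")]
    if len(children) == 1 and children[0].tag == w("t"):
        text = children[0].text or ""
        if len(text) == 1:
            return text
    return None


def replace_in_paragraph(paragraph, search, replacement):
    """Replace every occurrence of `search` in one w:p and return the count."""
    if not search:
        return 0
    split_into_char_runs(paragraph)
    count = 0
    i = 0
    while i < len(paragraph):
        window = list(paragraph)[i:i + len(search)]
        if len(window) == len(search) and all(
            char_of(e) == c for e, c in zip(window, search)
        ):
            # The new run takes the formatting of the first matched character.
            rpr = window[0].find(w("rPr"))
            for e in window:
                paragraph.remove(e)
            paragraph.insert(i, make_run(replacement, rpr))
            count += 1
        i += 1
    return count
```

For example, given a paragraph whose runs are a bold `Hel` followed by an unformatted `lo world`, calling `replace_in_paragraph(p, "Hello", "Goodbye")` leaves a single bold `Goodbye` run followed by the remaining characters. A fuller implementation would also merge adjacent runs with identical formatting once the replacements are done, and would visit every paragraph in the document, including those in text boxes, headers, and footers.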


21 Comments »

  1. egheorghe said,

    May 19, 2011 @ 3:07 pm

    Eric,

    We have a large number of VBA macros in Word. We would like to preserve some of this code while running it on the server, we hope with minimal changes. Is this possible? Any suggestions as to where to find someone to help with such a project?

    Thanks,

    Eugen

  2. Eric White said,

    May 20, 2011 @ 2:24 pm

    Hi Eugen,

    Take a look at [MS-OFFMACRO]: Office Macro-Enabled File Format Specification, which details how macros are stored in IS29500 (Open XML). Also take a look at [MS-OVBA]: Office VBA File Format Structure Specification, which documents the internal structure of the binary parts that store the macros.

    I’d be happy to chat with you about the project. I’m not certain that I could help, but after talking, I would have a pretty good idea if I could or not. Feel free to contact me – eric at ericwhite.com, and we’ll set up a time to talk.

    -Eric

  3. NotEricWhite said,

    May 31, 2011 @ 7:51 pm

    What restaurant were you at when you made this video?

  4. Eric White said,

    June 23, 2011 @ 8:09 pm

    A very good one – was having butter naan with chutney, if I remember correctly. 🙂

  5. Thomas said,

    June 7, 2011 @ 9:42 pm

    Is it possible to replace the text with a content control instead of other text?

  6. Eric White said,

    June 8, 2011 @ 1:28 am

    Sure! Instead of inserting the run that contains the replacement text, insert a content control that contains the run with the replacement text (or whatever you want the content control to contain).

  7. Thomas said,

    June 8, 2011 @ 1:39 pm

    I can sure use some help in trying to accomplish this. Should I insert my code here or is there a better place to post code and questions?

  8. Eric White said,

    June 10, 2011 @ 8:00 am

    Hi Thomas,

    In module SearchAndReplace.cs, you will need to update the code from line 137 to line 156. That is the code that creates the new run that is inserted as replacement text. You will need to create the content control (w:sdt). In addition, you will need to decide what will go into the content control. If you want to insert the replacement text, you can use the existing code that creates a run. You can insert that code into the contents (w:sdtContent) of the content control.

    When possible, I’ll create an example.

    -Eric
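
The markup described in the reply above (a `w:sdt` whose `w:sdtContent` holds the replacement run) might be built as in the following sketch. It is written in Python with the standard library rather than in the C# of SearchAndReplace.cs, it is not the example promised in the reply, and the function name `make_content_control` is hypothetical. Properties such as an alias or tag would go in a `w:sdtPr` child, which is omitted here.

```python
import copy
import xml.etree.ElementTree as ET

W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
XML_SPACE = "{http://www.w3.org/XML/1998/namespace}space"
ET.register_namespace("w", W)


def w(tag):
    """Return the namespace-qualified name for a w: element."""
    return f"{{{W}}}{tag}"


def make_content_control(text, rpr=None):
    """Build w:sdt > w:sdtContent > w:r > w:t holding `text`.

    `rpr` is an optional w:rPr element (for instance, the formatting of the
    first matched character); a copy of it is placed in the new run.
    """
    sdt = ET.Element(w("sdt"))
    content = ET.SubElement(sdt, w("sdtContent"))
    run = ET.SubElement(content, w("r"))
    if rpr is not None:
        run.append(copy.deepcopy(rpr))
    t = ET.SubElement(run, w("t"))
    t.text = text
    t.set(XML_SPACE, "preserve")  # keep leading and trailing spaces
    return sdt
```

The element returned by `make_content_control` would be inserted into the paragraph at the position where the replacement run would otherwise go.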

  9. mir azam ali said,

    May 15, 2012 @ 2:30 pm

    I am getting inner text like this:
    “Advanced Enterprise SolutionsAP322 Configuration Design document ”. I want to replace it with “Advanced Enterprise testAP322 Configuration Design document ”, but this does not work if I give the replacement text as “Advanced Enterprise test”.

  10. Mort said,

    June 20, 2011 @ 11:03 pm

    Hi Eric,
    The search and replace function doesn’t seem to work for text that is inside a textbox.
    Mort

  11. Eric White said,

    June 21, 2011 @ 6:08 pm

    Hi Mort,

    I’ve fixed and updated the code in the post on OpenXMLDeveloper.org.

    -Eric

  12. Kevin said,

    June 23, 2011 @ 4:42 pm

    I am looking to adopt OpenXML and DocumentBuilder for my reporting solution in our company’s application. I need to use the Search and Replace Text shown here but I am confused about how to get the two ideas to work together. Can you provide an example or can the Search and Replace Text be integrated into DocumentBuilder?

  13. Eric White said,

    June 23, 2011 @ 7:26 pm

    Hi Kevin,

    You could search and replace text in documents before feeding them to DocumentBuilder, or you could construct a new document using DocumentBuilder, and then search and replace in the resulting document. How do you need to integrate search and replace with DocumentBuilder? Maybe you could say more about your scenario?

    Also, I’d really love to hear about your use of Open XML and DocumentBuilder – if appropriate, would like to discuss in a blog post.

    -Eric

  14. mir azam ali said,

    May 15, 2012 @ 2:35 pm

    This algorithm is not working for the following format when we replace “the markup” with “the sute”:

    See

    the

    markup.

    test.

  15. Karlyn Rough said,

    June 1, 2012 @ 1:56 pm

    Hey There. I found your blog using msn. This is a really well written article. I’ll make sure to bookmark it and come back to read more of your useful info. Thanks for the post. I will certainly return.

  16. Tess Dickles said,

    February 17, 2013 @ 10:36 pm

    Thank you for the well-written and informative article. This saved me weeks’ worth of work on a C# project on which I’m working.

    One quick question about find and replace that I’ve been stuck on:

    Is it possible to replace a word in an OpenXML document with two words separated by a line break?

    For example –
    original word=“ReplaceThisWord”
    new word=“Word\r\nReplaced”

    ReplaceThisWord

    Word
    Replaced

    Thanks again for the work you’ve shared!

  17. Eric White said,

    February 18, 2013 @ 6:52 am

    Hi Tess,

    I don’t have a good example for how to do this. This is a great idea though. I’ve added this to my list of content to write or record.

    -eric

  18. Andy said,

    December 14, 2013 @ 11:21 pm

    Hi Eric,

    Just wondering if you managed to add the “Newline” functionality Tess was talking about? This is something that would really help me out!

    Cheers

  19. Eric White said,

    December 15, 2013 @ 3:05 am

    I haven’t yet had a chance to get to it. However, I’ll put it at the top of the list. Should be able to attend to this in the next month or so.

  20. Mohamed Ali Khan said,

    March 22, 2014 @ 12:05 pm

    Hi Eric,
    What about searching and replacing the text only once instead of replacing all occurrences (matches)?

    Ex:

    abc,def abc,abc,def,ghi

    When we search for the word ‘abc’ and replace it with “ERIC”,

    we get ERIC,def ERIC,ERIC,def,ghi, which is fine.

    But I need to stop replacing after the first replacement. How do we do this?

    Desired output for the same example.

    ERIC,def abc,abc,def,ghi

    Help me resolve this!

    Thanks in advance.
    Mohamed.

  21. Vishal S said,

    August 24, 2016 @ 1:29 pm

    Hi Eric,

    Bullseye… Thanks a lot! Got a working solution after 3 days of searching and trying. It cannot be better than this.

    Thanks.
