Replacing Text in Revision Tracked Document
Tagged: Revision Tracking Text Replace
This topic contains 4 replies, has 2 voices, and was last updated by AlanSMac 8 years, 3 months ago.
August 23, 2016 at 12:18 pm #3648
Hi Eric,
I am replacing variables in docx templates with values in a Windows Service. To do this I use TextReplacer.SearchAndReplace. It turns out some of our customers have been using Revision Tracking with their templates and TextReplacer throws when it detects Revision Tracking elements. I’d like to handle this in our service.
I have tried removing the elements, which ends in a NullReferenceException:
System.NullReferenceException: Object reference not set to an instance of an object.
at OpenXmlPowerTools.PtOpenXmlExtensions.GetXDocument(OpenXmlPart part)
Code:
    private void ProcessRevisionTracking(WordprocessingDocument wordDoc)
    {
        //var revisionTrackingOn = RevisionAccepter.HasTrackedRevisions(wordDoc);
        //RevisionAccepter.P
        var revisionTrackingElements = wordDoc.MainDocumentPart.DocumentSettingsPart.Settings.Descendants<TrackRevisions>();
        var revisionTrackingOn = revisionTrackingElements
            .Any(rte => rte.Val = new DocumentFormat.OpenXml.OnOffValue(true));
        if (revisionTrackingOn)
        {
            logger.LogDebug("Revision tracking " + (revisionTrackingOn ? "detected" : "NOT enabled"));
            foreach (var rte in revisionTrackingElements)
                rte.Remove();
            List<OpenXmlPart> parts = new List<OpenXmlPart>();
            parts.Add(wordDoc.MainDocumentPart);
            parts.AddRange(wordDoc.MainDocumentPart.HeaderParts);
            parts.AddRange(wordDoc.MainDocumentPart.FooterParts);
            parts.Add(wordDoc.MainDocumentPart.EndnotesPart);
            parts.Add(wordDoc.MainDocumentPart.FootnotesPart);
            foreach (var part in parts)
            {
                ProcessRevisionTracking(part);
            }
            // wordDoc.MainDocumentPart.Document.Save();
            logger.LogDebug(String.Format("Revision elements still found? - {0}", RevisionAccepter.HasTrackedRevisions(wordDoc)));
        }
        else
            logger.LogDebug("Revision tracking NOT detected");
    }

    private void ProcessRevisionTracking(OpenXmlPart part)
    {
        //var matchNames = RevisionAccepter.TrackedRevisionsElements
        var revisionElements = part.GetXDocument().Descendants()
            .Where(desc => RevisionAccepter.TrackedRevisionsElements.Contains(desc.Name))
            .ToArray();
        logger.LogDebug(revisionElements.Length + " revision elements found to remove.");
        foreach (var elem in revisionElements)
        {
            elem.Remove();
        }
    }
I came across this page https://msdn.microsoft.com/en-us/library/ee836138(v=office.12).aspx#AcceptRevisions_RemovingElements which sounds like I’d need to do some complicated collapsing versus removing. Do you have any advice on how to easily replace the text or is it a case of me coding everything mentioned in that article? I don’t need Revision Tracking in the output document (but can have it if it’s easier), but I would need all the normal document elements.
I was also looking at the source code of https://github.com/VisualOn/OpenXmlPowerTools/blob/master/RevisionAccepter.cs method public static bool PartHasTrackedRevisions(OpenXmlPart part). I don’t understand how some of those elements that don’t sound RevisionTracking related indicate tracking is in use. For instance is the existence of a W.cellDel really enough to signal revision tracking is being used and that element relates to that?
- This topic was modified 8 years, 4 months ago by AlanSMac. Reason: Code did not format correctly
August 23, 2016 at 1:46 pm #3650
Hi,
You should be using OpenXmlRegex, not TextReplacer. OpenXmlRegex can do everything that TextReplacer can do, and a lot more, including replacing text in a document that contains tracked revisions.
http://www.ericwhite.com/blog/blog/openxmlregex-developer-center/
Cheers, Eric
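A minimal sketch of what a replacement pass with OpenXmlRegex might look like across the main document, headers, footers, footnotes and endnotes. The class and method names (VariableReplacer, ReplaceInAllParts) and the file-path parameter are hypothetical and only for illustration. The sketch assumes the OpenXmlRegex.Replace overload that takes a paragraph collection, a Regex, a replacement string and a callback, and it assumes that PutXDocument() is what writes the edited XDocument back to the part. FootnotesPart and EndnotesPart are null when a document has no such part, so they are guarded before use; an unguarded null part is one possible source of a NullReferenceException in GetXDocument.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;
using DocumentFormat.OpenXml.Packaging;
using OpenXmlPowerTools;

public static class VariableReplacer
{
    // Hypothetical helper, not part of Open-Xml-PowerTools.
    // Replaces every match of `pattern` with `value` in all text-bearing parts.
    public static int ReplaceInAllParts(string path, string pattern, string value)
    {
        int total = 0;
        using (var doc = WordprocessingDocument.Open(path, true))
        {
            var main = doc.MainDocumentPart;

            // Collect the parts to search; optional parts are null when absent.
            var parts = new List<OpenXmlPart> { main };
            parts.AddRange(main.HeaderParts);
            parts.AddRange(main.FooterParts);
            if (main.FootnotesPart != null) parts.Add(main.FootnotesPart);
            if (main.EndnotesPart != null) parts.Add(main.EndnotesPart);

            var regex = new Regex(pattern);
            foreach (var part in parts)
            {
                // Work on the LINQ to XML view of the part.
                var paragraphs = part.GetXDocument().Descendants(W.p).ToList();

                // The callback decides per match whether to replace; here: always.
                total += OpenXmlRegex.Replace(paragraphs, regex, value, (element, match) => true);

                // Assumption: this serializes the modified XDocument back into the part.
                part.PutXDocument();
            }
        }
        return total;
    }
}
```

This keeps all edits on the LINQ to XML side inside one open/close cycle, so nothing here touches the strongly typed object model.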
August 24, 2016 at 6:06 pm #3656
Hi Eric,
Thanks for your reply. I am in the middle of converting my code over to use OpenXmlRegex. I followed the link and also watched one of your YouTube videos about it, but I can’t get the replace to work even though I think I am calling it correctly. I can find matches with the regex but cannot replace the value. I am in the UK and at home now, so I will try again tomorrow when I’m in the office. I think it might be because all my existing code was based on WordprocessingDocument and now I have had to call doc.MainDocumentPart.GetXDocument(); and manipulate that. Maybe it’s not persisting back and I have to save to the same stream?
    public void ReplaceFirst(WordprocessingDocument doc, params KeyValuePair<string, string>[] kvps)
    {
        var xdoc = doc.MainDocumentPart.GetXDocument();
        foreach (var kvp in kvps)
        {
            //OOXML library does not like null or empty
            string value = (kvp.Value == null || kvp.Value == string.Empty) ? " " : kvp.Value;
            logger.LogDebug("Applying value: [" + kvp.Key + "] " + value);
            //var content = doc.MainDocumentPart.Document.Body.Descendants<Text>();
            var content = xdoc.Descendants(W.p);
            logger.LogDebug("Found " + content.Count() + " text elements to search");
            //var regex = new Regex(VariablePrefix + kvp.Key + VariableSuffix);
            var regex = new Regex("contact");
            logger.LogDebug(OpenXmlRegex.Match(content, regex) + " matches");
            bool isFirstReplacement = true;
            OpenXmlRegex.Replace(content, regex, value, (xElement, match) =>
            {
                if (isFirstReplacement)
                {
                    isFirstReplacement = false;
                    logger.LogDebug("Replaced match");
                    return true;
                }
                logger.LogDebug("Did not replace match");
                return false;
            });
            //TextReplacer.SearchAndReplace(doc, VariablePrefix + kvp.Key + VariableSuffix, value, false);
        }
    }
Later on the WordprocessingDocument is saved via wordDoc.MainDocumentPart.Document.Save()
I am wondering if I need to do something to see changes in a WordprocessingDocument caused by running OpenXmlRegex against an XDocument.
Thanks
August 25, 2016 at 1:33 pm #3668
Hi Alan,
You may be seeing a problem associated with using the strongly-typed OM vs using LINQ to XML (which Open-Xml-PowerTools uses).
The short answer – before and after using OpenXmlRegex, close and reopen the document.
There is some strange caching in the strongly typed OM that doesn’t play nicely with using other XML technologies. I used to try to deal with this caching and avoid opening and closing the document. However, it is super cheap to open / close, and there are edge cases associated with caching that make it difficult, so my recommendation now is to close and reopen the document when you need to use OpenXmlRegex.
Cheers, Eric
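The close-and-reopen advice above could be sketched roughly as follows for a document held in memory, as it might be in a service. The names TwoPhaseEdit and Apply are hypothetical, and the comment in phase 1 is a placeholder for whatever strongly typed edits the application performs. The sketch assumes that disposing the WordprocessingDocument flushes object-model changes to the stream, and that PutXDocument() writes the LINQ to XML changes back to the part before the second dispose.

```csharp
using System.IO;
using System.Linq;
using System.Text.RegularExpressions;
using DocumentFormat.OpenXml.Packaging;
using OpenXmlPowerTools;

public static class TwoPhaseEdit
{
    // Hypothetical helper: takes the bytes of a .docx, returns the edited bytes.
    public static byte[] Apply(byte[] docx, string pattern, string value)
    {
        using (var stream = new MemoryStream())
        {
            // Write into an expandable stream; a MemoryStream constructed
            // directly over the byte array cannot grow when the package is saved.
            stream.Write(docx, 0, docx.Length);

            // Phase 1: strongly typed object model only.
            using (var doc = WordprocessingDocument.Open(stream, true))
            {
                // ... edits via doc.MainDocumentPart.Document go here ...
            } // disposing closes the package and flushes phase-1 changes

            // Phase 2: reopen and use LINQ to XML / OpenXmlRegex only.
            using (var doc = WordprocessingDocument.Open(stream, true))
            {
                var part = doc.MainDocumentPart;
                var paragraphs = part.GetXDocument().Descendants(W.p).ToList();
                OpenXmlRegex.Replace(paragraphs, new Regex(pattern), value, (element, match) => true);

                // Assumption: needed so the XDocument changes reach the part.
                part.PutXDocument();
            }

            return stream.ToArray();
        }
    }
}
```

Keeping each phase to a single API avoids having two in-memory views of the same part alive at once, which is the situation the caching problem arises from.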
August 31, 2016 at 8:44 am #3706
Hi Eric,
Thanks for your help. I got this working for the revision/change tracking code. Strangely, I had to leave some other code using TextReplacer instead of OpenXmlRegex, because in that one case I couldn’t get the two object trees (strongly typed OM vs LINQ to XML) to sync in memory. I have to keep the strongly typed object model because it seems easier when inserting new elements. I need that to append breaks after newline characters, otherwise the new lines don’t work, and to do a little paragraph object manipulation elsewhere for a superfluous blank page issue. Perhaps it would have been easier than I thought to do this via LINQ to XML, but I like the properties and methods on the OM classes.
Thanks!