Adding/Getting Comments based on character position

This topic contains 3 replies, has 2 voices, and was last updated by Eric White 8 years, 2 months ago.

Viewing 4 posts - 1 through 4 (of 4 total)
  • #3789

    iunknown
    Participant

    I’m tasked with highlighting and adding new comments to an OpenXML document using its plain-text version. (We’re not supporting comments on non-text elements.)

    I can pull the plain-text version of the docx and I can enumerate the comments, but I’m at a loss as to how to get the start and stop positions of each comment.

    I have a feeling I’m missing something simple. Do I look at the comment’s parent, and then there’s some offset that gives its position?

    Or do I have to start at the top of the document and, as I pull out the plain text, remember the start and stop positions whenever I run across a comment, instead of attempting to use the comments subsection?

    Eric, can you point me in the right direction?

    thanks,

    Gene

    #3790

    Eric White
    Keymaster

    Hi Gene,

    I’m not fully clear on your question.

    Comments have markup in the main document part to indicate where the commented range starts and ends (w:commentRangeStart, w:commentRangeEnd). These elements sit at the specific locations in the document that the comment covers. The actual text of the comment is in the comments part, and you find it by matching the w:id attribute of the range elements to the w:id of the corresponding w:comment element.
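As a rough illustration of the markup described above (not code from this thread), the following Python sketch pairs the range markers in the main document part with the comment text in the comments part. It uses only the standard library. The sample XML is hand-written and heavily simplified, and the helper names `comment_text_by_id` and `range_ids` are hypothetical.

```python
import xml.etree.ElementTree as ET

W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
NS = {"w": W}


def q(name):
    """Return the namespace-qualified tag or attribute name for w:<name>."""
    return f"{{{W}}}{name}"


# Simplified stand-in for word/document.xml (assumption: hand-written sample).
DOCUMENT_XML = f"""<w:document xmlns:w="{W}"><w:body><w:p>
<w:r><w:t xml:space="preserve">Hello </w:t></w:r>
<w:commentRangeStart w:id="0"/>
<w:r><w:t>world</w:t></w:r>
<w:commentRangeEnd w:id="0"/>
<w:r><w:commentReference w:id="0"/></w:r>
</w:p></w:body></w:document>"""

# Simplified stand-in for word/comments.xml (assumption: hand-written sample).
COMMENTS_XML = f"""<w:comments xmlns:w="{W}">
<w:comment w:id="0" w:author="Gene">
<w:p><w:r><w:t>Check this word</w:t></w:r></w:p>
</w:comment></w:comments>"""


def comment_text_by_id(comments_xml):
    """Map each comment's w:id to the concatenated text of its w:t elements."""
    root = ET.fromstring(comments_xml)
    result = {}
    for comment in root.findall("w:comment", NS):
        text = "".join(t.text or "" for t in comment.iter(q("t")))
        result[comment.get(q("id"))] = text
    return result


def range_ids(document_xml):
    """List the w:id of every w:commentRangeStart, in document order."""
    root = ET.fromstring(document_xml)
    return [el.get(q("id")) for el in root.iter(q("commentRangeStart"))]


comments = comment_text_by_id(COMMENTS_XML)
for cid in range_ids(DOCUMENT_XML):
    print(cid, comments[cid])
```

In a real .docx the two parts would be read from the package (for example with `zipfile`) rather than from string literals.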

    You will be interested in the following screen-cast:

    How to Research Open XML Markup

    Key point of that screen-cast: create a Word document (without a comment), copy the document, insert a comment in the copy, then use the Open XML SDK Productivity Tool to compare the two. This will teach you about comment markup.

    Cheers, Eric

    #3794

    iunknown
    Participant

    Thank you Eric,

    What I’m trying to do is pull the plain text and the comments out of the document, which I can do.
    The problem I’m having is knowing where a comment starts and stops in the plain text version.

    I did find CommentRangeStart and CommentRangeEnd but, as far as I could find, they don’t expose a document position.

    >>These elements are situated at the specific location in the document.
    Based on this statement, I think the approach I have to take is while extracting the plain text, record the current position of any comment starts and stops…

    But that doesn’t work because of the nested nature of OpenXML.
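The record-as-you-extract idea can still be sketched with a document-order traversal, which visits nested elements in reading order. The Python below is a minimal sketch under strong assumptions: only `w:t` text is counted, a newline is inserted between paragraphs, and tabs, breaks, fields, text boxes and tracked changes are ignored. The function name `text_and_comment_spans` and the sample XML are hypothetical, and the offsets only line up with a plain-text version produced by the same rules.

```python
import xml.etree.ElementTree as ET

W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def q(name):
    """Return the namespace-qualified tag or attribute name for w:<name>."""
    return f"{{{W}}}{name}"


# Simplified stand-in for word/document.xml (assumption: hand-written sample).
DOCUMENT_XML = f"""<w:document xmlns:w="{W}"><w:body>
<w:p><w:r><w:t xml:space="preserve">Hello </w:t></w:r>
<w:commentRangeStart w:id="0"/>
<w:r><w:t>world</w:t></w:r>
<w:commentRangeEnd w:id="0"/></w:p>
<w:p><w:r><w:t>Bye</w:t></w:r></w:p>
</w:body></w:document>"""


def text_and_comment_spans(document_xml):
    """Return (plain_text, {comment_id: (start, end)}) with end exclusive."""
    root = ET.fromstring(document_xml)
    parts = []       # text fragments collected so far
    pos = 0          # current character offset in the plain text
    starts = {}      # comment id -> offset where its range opened
    spans = {}       # comment id -> (start, end)
    seen_paragraph = False
    for el in root.iter():           # iter() walks in document order
        if el.tag == q("p"):
            if seen_paragraph:       # separate paragraphs with a newline
                parts.append("\n")
                pos += 1
            seen_paragraph = True
        elif el.tag == q("t"):
            text = el.text or ""
            parts.append(text)
            pos += len(text)
        elif el.tag == q("commentRangeStart"):
            starts[el.get(q("id"))] = pos
        elif el.tag == q("commentRangeEnd"):
            cid = el.get(q("id"))
            if cid in starts:
                spans[cid] = (starts[cid], pos)
    return "".join(parts), spans


text, spans = text_and_comment_spans(DOCUMENT_XML)
start, end = spans["0"]
print(repr(text[start:end]))  # prints 'world'
```

Because the range markers are visited in the same pass that builds the text, their positions fall out of the running offset without needing a position property on the elements themselves.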

    ugh.

    #3798

    Eric White
    Keymaster

    Yes, you are right, there is not an easy way to get the document position.

    In a recent project (WmlComparer), a module that compares two DOCX files and produces a new document that contains the precise differences between them (with certain restrictions), I transform the DOCX into a new form: an array of the precise content of the document. Each character and each image in the document occupies a single element of the array. The array is put together in such a way that it is possible to reconstruct a valid Open XML document from it. This approach resolves the problems associated with the nested nature of Open XML. You may be interested in watching this screen-cast:
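To make the one-element-per-character idea concrete, here is a toy Python sketch. It is not WmlComparer's actual data structure or code: the `Atom` class and the `flatten` and `regroup` helpers are hypothetical, images and run formatting are ignored, and the sample XML is hand-written. Each character carries enough context (its paragraph index and the comment ranges open at that point) to be grouped back into chunks later.

```python
from dataclasses import dataclass
from itertools import groupby
import xml.etree.ElementTree as ET

W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def q(name):
    """Return the namespace-qualified tag or attribute name for w:<name>."""
    return f"{{{W}}}{name}"


# Simplified stand-in for word/document.xml (assumption: hand-written sample).
DOCUMENT_XML = f"""<w:document xmlns:w="{W}"><w:body>
<w:p><w:r><w:t xml:space="preserve">Hello </w:t></w:r>
<w:commentRangeStart w:id="0"/>
<w:r><w:t>world</w:t></w:r>
<w:commentRangeEnd w:id="0"/></w:p>
<w:p><w:r><w:t>Bye</w:t></w:r></w:p>
</w:body></w:document>"""


@dataclass(frozen=True)
class Atom:
    char: str            # a single character of document text
    para: int            # index of the paragraph that contains it
    comments: frozenset  # ids of comment ranges open at this character


def flatten(document_xml):
    """Turn the nested markup into a flat list with one Atom per character."""
    root = ET.fromstring(document_xml)
    atoms = []
    para = -1
    open_ids = set()
    for el in root.iter():  # document order
        if el.tag == q("p"):
            para += 1
        elif el.tag == q("commentRangeStart"):
            open_ids.add(el.get(q("id")))
        elif el.tag == q("commentRangeEnd"):
            open_ids.discard(el.get(q("id")))
        elif el.tag == q("t"):
            for ch in el.text or "":
                atoms.append(Atom(ch, para, frozenset(open_ids)))
    return atoms


def regroup(atoms):
    """Merge consecutive atoms that share a paragraph and comment context."""
    return [
        (para, sorted(ids), "".join(a.char for a in group))
        for (para, ids), group in groupby(atoms, key=lambda a: (a.para, a.comments))
    ]


atoms = flatten(DOCUMENT_XML)
print(atoms[6].char, sorted(atoms[6].comments))  # the atom index is the character position
print(regroup(atoms))
```

In this flat form a character position is just a list index, so questions like "which comments cover character 6" become simple lookups, and a chunk list like the one from `regroup` is the kind of starting point from which runs could be rebuilt.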

    Introducing WmlComparer, a Module in Open-Xml-PowerTools

    It’s a rather long screen-cast, but it illustrates a sound approach to dealing with this issue.

    I have in mind a generalization of that approach so that developers can do the type of operations that you want to do, such as counting specific characters or easily inserting comments at any specific point. Writing WmlComparer really helped me to formalize my thoughts about this issue.

    Cheers, Eric
