Eric White

Forum Replies Created

Viewing 15 posts - 91 through 105 (of 253 total)
    in reply to: OpenXml Ranges Similar to Word Interop #3678

    Eric White
    Keymaster

    The exact position of characters in a document is an interesting question.

    I have done work in this area – I recently completed a new module in Open-Xml-PowerTools: WmlComparer.

    The work that I’ve done in this new module will definitely inform future Open-Xml-PowerTools development. If I wanted to calculate exact offsets, I would take an approach similar to the one I took in WmlComparer.

    Introducing WmlComparer, a Module in Open-Xml-PowerTools

    Given that we don’t have a way to calculate exact offsets right now, it might be good to take a step back. What are you attempting to do? What is your scenario? What user problem are you trying to solve? I may be able to recommend a different approach to solving your issue.

    Cheers, Eric

    in reply to: Using Content Select to insert formatted text blocks #3674

    Eric White
    Keymaster

    Hi Mark,

    Yes, indeed, importing a document into another document is a simpler case of importing HTML, and this would also be a feature of the new DocumentAssembler.

    The great thing is that because this new code would rely on existing Open-Xml-PowerTools code, it would not be a huge effort to write. I designed the system with this idea in mind.

    As with all such efforts, creating xUnit tests and productizing the code takes as much time as actually writing the code, or more.

    I hope to tackle this functionality in the next 2-3 months, but we’ll see how my schedule works out.

    Regards, Eric

    in reply to: MainDocumentPart.Annotation returns null #3673

    Eric White
    Keymaster

    I don’t have enough information to develop a theory about your issue. Why are you writing code to get the annotation, instead of calling GetXDocument on the part?

    A good way to start is from one of the existing, working Open-Xml-PowerTools examples. They work out of the box, and you can then modify the code to fit your scenario.

    in reply to: Merging style inheritance elements #3672

    Eric White
    Keymaster

    You should also take note of font merging semantics. See the FontMerge function in FormattingAssembler.

    In a past project, when I needed to test a property that might be defined on the paragraph or might be defined in a style, I added a GUID to the beginning of the text of the paragraph. I then processed the document using FormattingAssembler, opened the processed document, found the paragraph with my GUID, and looked at the paragraph and run properties, which are then correct according to the style chain, the default document properties, and of course any direct formatting on the paragraph. Tables are even more complicated because of table style inheritance and conditionally applied formatting. FormattingAssembler takes care of all of that. After processing with it, you only need to look at the local properties to see whether text is bold, or has some particular foreground or background color.
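
    A minimal sketch of the GUID-marker idea, using only LINQ to XML on in-memory markup. The helper names (TagParagraph, FindTagged, IsBold) are hypothetical and not part of Open-Xml-PowerTools, and the FormattingAssembler step itself is not shown. The sample run already carries a local w:b, standing in for what the processed document would contain.

    ```csharp
    using System;
    using System.Linq;
    using System.Xml.Linq;

    XNamespace w = "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    // Hypothetical helper: prefix the paragraph's first w:t with a fresh GUID and return the GUID.
    string TagParagraph(XElement paragraph)
    {
        string tag = Guid.NewGuid().ToString("N");
        XElement firstText = paragraph.Descendants(w + "t").First();
        firstText.Value = tag + firstText.Value;
        return tag;
    }

    // Hypothetical helper: find the paragraph whose concatenated text starts with the GUID.
    XElement FindTagged(XElement root, string tag) =>
        root.Descendants(w + "p").First(p =>
            string.Concat(p.Descendants(w + "t").Select(t => t.Value))
                  .StartsWith(tag, StringComparison.Ordinal));

    // Hypothetical helper: true when the run has a local w:b that is not switched off.
    bool IsBold(XElement run)
    {
        XElement b = run.Elements(w + "rPr").Elements(w + "b").FirstOrDefault();
        if (b == null) return false;
        string val = (string)b.Attribute(w + "val");
        return val != "0" && val != "false" && val != "off";
    }

    // Stand-in for the body of a main document part.
    XElement docBody = new XElement(w + "body",
        new XElement(w + "p", new XElement(w + "r", new XElement(w + "t", "Plain paragraph"))),
        new XElement(w + "p",
            new XElement(w + "r",
                new XElement(w + "rPr", new XElement(w + "b")),
                new XElement(w + "t", "Paragraph under test"))));

    string marker = TagParagraph(docBody.Elements(w + "p").ElementAt(1));

    // ... here the document would be saved, run through FormattingAssembler, and reopened ...

    XElement found = FindTagged(docBody, marker);
    XElement firstRun = found.Elements(w + "r").First();
    Console.WriteLine($"bold: {IsBold(firstRun)}");
    ```

    Matching on the concatenated text of the paragraph, rather than on a single w:t, keeps the lookup working if the marker ends up split across runs.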

    in reply to: Issue while Exporting excel template #3671

    Eric White
    Keymaster

    I don’t have any existing samples that show how to accomplish what you want. I do have examples that generate spreadsheets, specifically the streaming example that enables creating huge spreadsheets.

    Screen-Cast: Using Open XML and LINQ to XML in a Streaming Fashion to Create Huge Spreadsheets

    That example focuses on the streaming approach, with a minimalist approach to formatting. Controlling drop-down lists could be thought of as an advanced form of formatting.

    I have often wanted to put far more effort into SpreadsheetML, but due to the massive demand for tools / process for WordprocessingML, to date I have not had the opportunity to do so. I know that there is demand to be able to generate spreadsheets in a far easier fashion, and controlling drop-down lists is an interesting and important scenario.

    Cheers, Eric

    in reply to: How to identify '\r' carriage return in openxml #3670

    Eric White
    Keymaster

    Hi Manu,

    I’d have to see the document to understand what you are seeing.

    One thing that you can’t get from the XML is the layout of the text – you can’t see how Word wraps lines and where the wrapping breaks are. This information is calculated by a layout engine inside of Word.

    Cheers, Eric

    in reply to: Replacing Text in Revision Tracked Document #3668

    Eric White
    Keymaster

    Hi Alan,

    You may be seeing a problem associated with using the strongly-typed OM vs using LINQ to XML (which Open-Xml-PowerTools uses).

    The short answer – before and after using OpenXmlRegex, close and reopen the document.

    There is some strange caching in the strongly typed OM that doesn’t play nicely with using other XML technologies. I used to try to deal with this caching and avoid opening and closing the document. However, it is super cheap to open / close, and there are edge cases associated with caching that make it difficult, so my recommendation now is to close and reopen the document when you need to use OpenXmlRegex.

    Cheers, Eric


    Eric White
    Keymaster

    Hi,

    Sorry for the slow response – have been on vacation, and dealing with illness of a family member.

    Currently, the scenario you describe is not a feature of OpenXmlRegex. It is a great idea, though. I have added it to the list of possible enhancements to OpenXmlRegex.

    Best, Eric

    in reply to: Hyperlink to Bookmark #3666

    Eric White
    Keymaster

    As you know, you can create a bookmark by inserting w:bookmarkStart / w:bookmarkEnd elements at the appropriate place in the markup:

        <w:p>
          <w:bookmarkStart w:id="1"
                           w:name="TestBookmark"/>
          <w:bookmarkEnd w:id="1"/>
          <w:r>
            <w:t>To make your document look professionally produced.</w:t>
          </w:r>
        </w:p>

    You can create a link to this bookmark by inserting the following markup:

        <w:p>
          <w:hyperlink w:anchor="TestBookmark"
                       w:history="1">
            <w:r w:rsidRPr="001B53E7">
              <w:rPr>
                <w:rStyle w:val="Hyperlink"/>
              </w:rPr>
              <w:t>TestBookmark</w:t>
            </w:r>
          </w:hyperlink>
          <w:r>
            <w:t>Video provides a powerful way.</w:t>
          </w:r>
        </w:p>

    The w:hyperlink element should be a child of the paragraph, and a sibling to the runs.
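
    The same markup can be generated with LINQ to XML. This is a minimal sketch rather than Open-Xml-PowerTools code. The helper names BookmarkedParagraph and BookmarkLink are hypothetical, and the sketch assumes the document already defines the Hyperlink character style.

    ```csharp
    using System;
    using System.Linq;
    using System.Xml.Linq;

    XNamespace w = "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    // Hypothetical helper: a paragraph that starts with an empty bookmark.
    XElement BookmarkedParagraph(int id, string name, string text) =>
        new XElement(w + "p",
            new XElement(w + "bookmarkStart",
                new XAttribute(w + "id", id),
                new XAttribute(w + "name", name)),
            new XElement(w + "bookmarkEnd", new XAttribute(w + "id", id)),
            new XElement(w + "r", new XElement(w + "t", text)));

    // Hypothetical helper: a w:hyperlink whose w:anchor names the target bookmark.
    XElement BookmarkLink(string bookmarkName, string linkText) =>
        new XElement(w + "hyperlink",
            new XAttribute(w + "anchor", bookmarkName),
            new XAttribute(w + "history", 1),
            new XElement(w + "r",
                new XElement(w + "rPr",
                    new XElement(w + "rStyle", new XAttribute(w + "val", "Hyperlink"))),
                new XElement(w + "t", linkText)));

    XElement target = BookmarkedParagraph(1, "TestBookmark",
        "To make your document look professionally produced.");

    // The hyperlink is a child of the paragraph and a sibling of the ordinary run.
    XElement linking = new XElement(w + "p",
        BookmarkLink("TestBookmark", "TestBookmark"),
        new XElement(w + "r", new XElement(w + "t", "Video provides a powerful way.")));

    // Declare the w prefix on a wrapper element so the printed markup is readable.
    XElement body = new XElement(w + "body",
        new XAttribute(XNamespace.Xmlns + "w", w.NamespaceName),
        target, linking);
    Console.WriteLine(body);
    ```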

    -Eric

    in reply to: Determining Paragraph Indexes #3665

    Eric White
    Keymaster

    Hi,

    If you look at line 431 in DocumentBuilder.cs, you can see where it selects the content to be included from a source document:

        List<XElement> contents = doc.MainDocumentPart.GetXDocument()
            .Root
            .Element(W.body)
            .Elements()
            .Skip(source.Start)
            .Take(source.Count)
            .ToList();

    If you are not seeing what you want in the merged document, you can look at what is selected by the above statement and see where the discrepancy is. You can then calculate Start and Count so that this statement returns the correct content.

    Looking at your code, nothing stands out as incorrect. Mainly, when I code for this scenario, I actually count elements in the list, rather than using FindIndex, but it seems to me that FindIndex would work just as well.

    My recommendation is to look at the results of the above query, and see why you are not getting the content you want to get.
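
    A small sketch of how Start and Count relate to the children of w:body, using a made-up body and plain LINQ to XML. The sample content and the Para helper are assumptions for illustration. Every child element of the body counts toward the index, including tables and the trailing w:sectPr, not only paragraphs.

    ```csharp
    using System;
    using System.Collections.Generic;
    using System.Linq;
    using System.Xml.Linq;

    XNamespace w = "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    // Hypothetical helper: a paragraph with a single run of text.
    XElement Para(string text) =>
        new XElement(w + "p", new XElement(w + "r", new XElement(w + "t", text)));

    // Stand-in for a main document part: five children of w:body.
    XElement body = new XElement(w + "body",
        Para("Intro"),
        Para("Chapter 1"),
        new XElement(w + "tbl"),
        Para("Chapter 2"),
        new XElement(w + "sectPr"));

    List<XElement> children = body.Elements().ToList();

    // Start is the index of the first element to include.
    int start = children.FindIndex(e => e.Value == "Chapter 1");
    // The selection stops just before the "Chapter 2" paragraph.
    int end = children.FindIndex(e => e.Value == "Chapter 2");
    int count = end - start;

    // Same shape as the DocumentBuilder query: Skip(Start).Take(Count).
    List<XElement> contents = body.Elements().Skip(start).Take(count).ToList();

    Console.WriteLine($"start = {start}, count = {count}");
    foreach (XElement item in contents)
        Console.WriteLine($"{item.Name.LocalName}: {item.Value}");
    ```

    Here the selection is the "Chapter 1" paragraph plus the table that follows it, so start is 1 and count is 2.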

    Let me know how I can help further…

    Cheers, Eric

    in reply to: Form feed #3664

    Eric White
    Keymaster

    The \f as received from Word interop denotes a section break, I believe. The existence of a section is based on the w:sectPr element. There is a strange aspect of the Open XML markup: the section properties for the last section of the document are stored in a w:sectPr element that is a child of the w:body element (in other words, a sibling to the last paragraph), whereas the section properties for every other section are stored in the paragraph properties of the last paragraph of that section. See the screen-cast on sections and headings in the following series:

    Introduction to WordprocessingML

    By looking for the section properties, you can figure out where the section breaks are.
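
    A minimal sketch of splitting the children of w:body into sections by looking for w:sectPr, using only LINQ to XML. The sample body and the Para helper are assumptions for illustration. A paragraph whose w:pPr contains a w:sectPr is treated as the last paragraph of its section, and the w:sectPr that is a direct child of w:body closes the final section.

    ```csharp
    using System;
    using System.Collections.Generic;
    using System.Linq;
    using System.Xml.Linq;

    XNamespace w = "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    // Hypothetical helper: a paragraph, optionally carrying section properties in its w:pPr.
    XElement Para(string text, bool endsSection = false) =>
        new XElement(w + "p",
            endsSection ? new XElement(w + "pPr", new XElement(w + "sectPr")) : null,
            new XElement(w + "r", new XElement(w + "t", text)));

    // Stand-in for a two-section document.
    XElement body = new XElement(w + "body",
        Para("A1"),
        Para("A2", endsSection: true),
        Para("B1"),
        new XElement(w + "sectPr"));

    var sections = new List<List<XElement>>();
    var current = new List<XElement>();
    foreach (XElement child in body.Elements())
    {
        if (child.Name == w + "sectPr")
        {
            // Body-level w:sectPr: properties of the last section, not content.
            sections.Add(current);
            current = new List<XElement>();
            continue;
        }

        current.Add(child);

        // A w:sectPr inside w:pPr marks the last paragraph of a section.
        bool closesSection = child.Name == w + "p"
            && child.Elements(w + "pPr").Elements(w + "sectPr").Any();
        if (closesSection)
        {
            sections.Add(current);
            current = new List<XElement>();
        }
    }
    if (current.Count > 0) sections.Add(current); // content with no closing w:sectPr

    for (int i = 0; i < sections.Count; i++)
        Console.WriteLine($"Section {i + 1}: {string.Join(" | ", sections[i].Select(s => s.Value))}");
    ```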

    Does this help? Please ask more if you need more info.

    in reply to: List Numbering on Merged Docs #3663

    Eric White
    Keymaster

    Hi Alan,

    I presume that you are using the DocumentBuilder module, correct? I ask this because it is possible to encounter a similar issue with DocumentAssembler…

    This is one of the more complex issues associated with using DocumentBuilder – something that I have wanted to address for years, but have never had the time.

    Here is the ideal way to fix this issue. It may seem complex, but it really is not too bad. The gist of this approach is that if you want lists to operate in isolation, each list needs its own unique w:numId and w:abstractNumId.

    Each numbered list, regardless of whether it is numbering based on style, or numbering directly applied to the paragraph, has a w:numId associated with it. In the numbering part, this w:numId refers to a w:abstractNumId. Numbering is calculated based on the abstractNumId.

    So if you want your lists to count in isolation, then each list needs to have a unique numId and abstractNumId.

    Let’s say that you have Document1 and Document2 that each contain lists with the same numId and abstractNumId, probably because they originated from the same document. The pseudo code would be:

    • Find the maximum numId and abstractNumId values used across both documents. MAXNUMID=maximum of numId. MAXABSTRACTNUMID=maximum of abstractNumId.
    • Leave Document1 as is.
    • Go through Document2, adding MAXNUMID to each definition and reference of a numId value. These values need to be changed in the main document part, in the styles part, and in the numbering part.
    • Go through Document2, adding MAXABSTRACTNUMID + 1 to each definition and reference of an abstractNumId value. The extra 1 is needed because abstractNumId values typically start at 0, so adding the maximum alone can leave the highest value in Document1 equal to the lowest value in Document2. These values need to be changed in the numbering part only.
    • At the end of this process, the w:numId values will be unique across the two documents, as will the w:abstractNumId values. The documents will then merge with DocumentBuilder in such a way that each numbered list will count in isolation.
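
    The steps above can be sketched with LINQ to XML on in-memory copies of the parts. This is an illustration, not DocumentBuilder functionality. MaxId and Shift are hypothetical helpers, the styles part is left out for brevity, and the sample markup is made up. The sketch offsets both kinds of id by the maximum plus one, which stays collision-free even when abstractNumId values start at 0. It also leaves a w:numId reference of 0 alone, on the assumption that 0 means "no numbering" rather than a real list.

    ```csharp
    using System;
    using System.Collections.Generic;
    using System.Linq;
    using System.Xml.Linq;

    XNamespace w = "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    // Hypothetical helper: largest integer value of `attr` on `element` across the trees (0 if none).
    int MaxId(XName element, XName attr, IEnumerable<XElement> roots) =>
        roots.SelectMany(r => r.DescendantsAndSelf(element))
             .Select(e => (int?)e.Attribute(attr) ?? 0)
             .DefaultIfEmpty(0)
             .Max();

    // Hypothetical helper: add `offset` to `attr` on every `element` in the tree.
    void Shift(XElement root, XName element, XName attr, int offset, bool skipZero = false)
    {
        foreach (XElement e in root.DescendantsAndSelf(element))
        {
            var a = e.Attribute(attr);
            if (a == null) continue;
            int value = (int)a;
            if (skipZero && value == 0) continue;
            a.Value = (value + offset).ToString();
        }
    }

    // Made-up numbering part: one abstract definition (id 0) and one instance (numId 1).
    XElement numbering1 = new XElement(w + "numbering",
        new XAttribute(XNamespace.Xmlns + "w", w.NamespaceName),
        new XElement(w + "abstractNum", new XAttribute(w + "abstractNumId", 0)),
        new XElement(w + "num", new XAttribute(w + "numId", 1),
            new XElement(w + "abstractNumId", new XAttribute(w + "val", 0))));

    // Document2 originated from the same document, so its ids collide with Document1.
    XElement numbering2 = new XElement(numbering1);
    XElement body2 = new XElement(w + "body",
        new XAttribute(XNamespace.Xmlns + "w", w.NamespaceName),
        new XElement(w + "p",
            new XElement(w + "pPr",
                new XElement(w + "numPr",
                    new XElement(w + "ilvl", new XAttribute(w + "val", 0)),
                    new XElement(w + "numId", new XAttribute(w + "val", 1))))));

    var numberingParts = new[] { numbering1, numbering2 };
    int numOffset = MaxId(w + "num", w + "numId", numberingParts) + 1;
    int absOffset = MaxId(w + "abstractNum", w + "abstractNumId", numberingParts) + 1;

    // numId: definitions in the numbering part, references in the main document part
    // (the styles part would get the same treatment as body2).
    Shift(numbering2, w + "num", w + "numId", numOffset);
    Shift(body2, w + "numId", w + "val", numOffset, skipZero: true);

    // abstractNumId: definitions and references both live in the numbering part.
    Shift(numbering2, w + "abstractNum", w + "abstractNumId", absOffset);
    Shift(numbering2, w + "abstractNumId", w + "val", absOffset);

    Console.WriteLine(numbering2);
    Console.WriteLine(body2);
    ```

    Document1 is left untouched. After the shift, Document2 uses numId 3 and abstractNumId 1 in this sample, so neither collides with Document1.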

    Ideally, there would be an option in DocumentBuilder that would enable you to specify that the lists in the document should be processed in isolation, or that the lists should be merged with lists in other source documents. I’ve added this item to my list of possible enhancements for Open-Xml-PowerTools.

    Cheers, Eric

    in reply to: HorizontalPositionRelativeToPage #3657

    Eric White
    Keymaster

    Hi,

    This would require a layout engine, and the Open-Xml-Sdk does not contain a layout engine.

    As far as I know, there is no open source Open XML layout engine. There may be proprietary layout engines, but I have no experience with them.

    However, if I had to calculate something like this, I would first use the FormattingAssembler module in Open-Xml-PowerTools to ‘roll-up’ all of the styling information into local styling.

    Screen-Cast: Introducing the FormattingAssembler Module

    Watch the screen-cast at the bottom.

    After running FormattingAssembler, you have a new Open XML document where all styling (including positioning of the paragraph) is stored locally in the paragraph and run properties. You can then calculate positions based on the values stored in the markup.

    But this is not going to be the same as using Word automation, which takes advantage of the internal layout engine in Word.

    Cheers, Eric

    in reply to: TEXT INSIDE A SECTION #3651

    Eric White
    Keymaster

    There isn’t a way to do this using the Open XML SDK directly, but it is not hard.

    One thing to review is how revision markup is put into the main document part. Check out screen-cast #8 in the following series:

    Introduction to WordprocessingML

    There is a DocumentBuilder example that shreds a document into sections. See the second example in DocumentBuilder02.cs, in Open-Xml-PowerTools. This might be an easy solution to your issue: first shred the document into sections, then open the desired shredded document and retrieve the text.

    in reply to: Replacing Text in Revision Tracked Document #3650

    Eric White
    Keymaster

    Hi,

    You should be using OpenXmlRegex, not TextReplacer. OpenXmlRegex can do everything that TextReplacer can do, and a lot more, including replacing text in a document that contains tracked revisions.

    http://www.ericwhite.com/blog/blog/openxmlregex-developer-center/

    Cheers, Eric
