Eric White

Forum Replies Created

Viewing 15 posts - 16 through 30 (of 253 total)
  • in reply to: HtmlToWmlConverter adding page numbers #4166

    Eric White
    Keymaster

    Hi Garth,

    At the moment it is not possible.

    Adding page numbers (and headers / footers) would require writing a layout engine. AFAIK, right now there is no open source layout engine. Without a layout engine, one can’t do pagination. Without pagination, there is no way to determine the page number, etc.

    In the document, there are w:lastRenderedPageBreak elements. Unfortunately, there are ‘bugs’ in Word that cause these to be put into the wrong places in a number of circumstances. I put quotes on ‘bugs’ because Word does not guarantee the correct placement of these elements, so Microsoft doesn’t consider these to be bugs, I believe.
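To see what these elements give you, here is a minimal Python sketch (standard library only, not part of Open-Xml-PowerTools) that counts the w:lastRenderedPageBreak elements in the XML of a main document part. The function name and the sample markup are made up for illustration, and the count is only as reliable as Word's placement of the elements, so treat the result as a rough estimate.

```python
import xml.etree.ElementTree as ET

# WordprocessingML main namespace
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def count_rendered_page_breaks(document_xml: str) -> int:
    """Count w:lastRenderedPageBreak elements anywhere in the given XML."""
    root = ET.fromstring(document_xml)
    return sum(1 for _ in root.iter(f"{{{W_NS}}}lastRenderedPageBreak"))


# Hypothetical, hand-written markup standing in for word/document.xml
SAMPLE = (
    f'<w:document xmlns:w="{W_NS}"><w:body>'
    '<w:p><w:r><w:t>Page one</w:t></w:r></w:p>'
    '<w:p><w:r><w:lastRenderedPageBreak/><w:t>Page two</w:t></w:r></w:p>'
    '<w:p><w:r><w:lastRenderedPageBreak/><w:t>Page three</w:t></w:r></w:p>'
    '</w:body></w:document>'
)

breaks = count_rendered_page_breaks(SAMPLE)
print(breaks + 1)  # estimated page count for the sample: 3
```

The estimate is simply the number of breaks plus one. It reflects the last time Word rendered the document, and it says nothing about where a browser or another renderer would break pages.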

    So currently, there is no good path to do this.

    Best, Eric

    in reply to: Issue with listnum track changes(Revisions) #4147

    Eric White
    Keymaster

    Hi Manu,

    The list items are never stored in the content. They are always calculated. A change in list numbering does not affect this. You can detect that the list numbering has changed, but if you actually want the deleted list items, it is pretty complicated, and not at your fingertips.

    I guess that I would need to hear more about the problem you are trying to solve. What is your user scenario?

    Best, Eric

    in reply to: Issue with listnum track changes(Revisions) #4144

    Eric White
    Keymaster

    Hi Manu,

    I am not quite clear about your question – need more info.

    With regard to the deleted listnum, what exactly are you referring to? In Word itself, we see the list items, which are the displayed representation of any paragraph that has list numbering. In the Open XML markup, there is the listnum attribute for every paragraph that has list numbering. You can find this listnum attribute in the paragraph itself, or in the styles part for the specific style.

    The styles part contains tracked revisions, including changes to styles where numbering has changed.

    With regard to calculating the list item for any paragraph (which is what the ListItemRetriever.cs module does), processing list items is pretty complicated. It took several tries before I found every numbering bug in ListItemRetriever.cs. But ListItemRetriever.cs presumes that there are no tracked revisions. I have not yet contemplated the problem of determining list items in a document that contains tracked revisions.

    Please give me more information with specifics, and I’ll be happy to help 🙂

    Cheers, Eric


    Eric White
    Keymaster

    Hi,

    Looking at your code, nothing looks incorrect to me, but I normally use LINQ to XML to access and manipulate content instead of the strongly-typed object model, so I may be missing something.

    In general, if you need to generate documents from data, I recommend using the DocumentAssembler module in Open-Xml-PowerTools.

    http://www.ericwhite.com/blog/blog/documentassembler-developer-center/

    Watch those videos – you can do document assembly without writing code.

    If you go down that path, and if you continue to use the strongly-typed object model, I recommend closing and opening the document any time you switch between using LINQ to XML and the strongly-typed object model.

    Best, Eric

    in reply to: Could not load file or assembly System.IO.Packaging #4137

    Eric White
    Keymaster

    Hi,

    I’m afraid that I don’t have much to do with the NuGet package and its dependencies these days. I didn’t create those, and haven’t used them. I’m sure they are great; I just don’t know much about them.

    Sorry I don’t have a better answer for you, but thought I’d let you know…

    Personally, I work directly from the github repos, and I’m afraid I am lazy and don’t try out the other various ways to use the Open-Xml-Sdk and Open-Xml-PowerTools. But I’m sure you can get it working 🙂

    Cheers, Eric

    in reply to: Bullets and numbering extra indentation #4134

    Eric White
    Keymaster

    I don’t have quite enough information to help with this. Can you post the markup for the entire paragraph, and point out where you think it is wrong? Feel free to post a doc online somewhere if you want me to examine the markup.

    There is an issue I see with your description – tabs are not ‘additive’ – they are directly positioned.

    A second issue is that Word has a fairly complex algorithm for text positioning. For example, if a word extends past a tab position, Word automatically ‘creates’ a tab based on specific settings in the Open XML document. This can cause problems when transforming Open XML (rendered by Word) to HTML (rendered by browsers): font metrics are ever so slightly different, so something that fits within one tab as rendered in Word can extend beyond that tab as rendered in the browser, causing text to ‘shift over’.
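To make the ‘directly positioned’ point concrete, here is a simplified Python sketch of how a tab might resolve to a position. It is an illustrative model only, not Word's actual algorithm and not code from WmlToHtmlConverter: it assumes positions in twips, left-aligned custom tab stops, and a default tab interval of 720 twips, and it ignores indents, alignment types, and leaders. The function name is hypothetical.

```python
def next_tab_stop(position: int, custom_stops: list[int], default_interval: int = 720) -> int:
    """Return the position a tab character jumps to from `position`.

    Simplified model: use the first custom stop beyond the current position;
    past the last custom stop, fall back to multiples of the default interval.
    """
    for stop in sorted(custom_stops):
        if stop > position:
            return stop
    return (position // default_interval + 1) * default_interval


stops = [1440, 2880]

# Text measured at 1430 twips wide fits before the first stop.
print(next_tab_stop(1430, stops))  # 1440

# The same text measured 20 twips wider by a different text renderer
# passes the first stop, so the tab lands on the second one.
print(next_tab_stop(1450, stops))  # 2880
```

In this model a tab never adds a fixed width. It always jumps to an absolute position, which is why a tiny difference in measured text width can move the following text by a whole tab stop.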

    The WmlToHtmlConverter project was never intended to do a pixel by pixel rendering of the document. It isn’t possible without writing a layout engine that is compatible feature-for-feature with Word, and this would include using a Word compatible text renderer (which browsers are not). There is enough mismatch between the layout system of Word and the layout system of HTML that you just have to do the best you can, and then don’t worry too much about it.

    The intent of WmlToHtmlConverter is to give a pretty good representation of the document, but it can’t be perfect.

    With regard to your specific case, it is important to understand exactly what is going on. It is possible that the “extra” node is adjusting the ‘tabbing’ system (which calculates spans with a given width) in an unexpected way.

    Best, Eric

    in reply to: wml comparer – can't compare object like textboxes #4117

    Eric White
    Keymaster

    Yes, you are right. That is a current limitation of the module. I have tentative plans to work on that module, enhancing it to support text boxes, and nested tables. However, there is no schedule for this right now.

    Best, Eric

    in reply to: Continuous Numbering issue #4113

    Eric White
    Keymaster

    Sorry, I’m still not clear. Are you saying that the behavior of Word is incorrect? Or the behavior of Open-Xml-PowerTools is incorrect? Which module?

    Best, Eric

    in reply to: Continuous Numbering issue #4105

    Eric White
    Keymaster

    Hello Manu,

    I am not certain exactly what you are referring to. Are you having issues with one of the modules in Open-Xml-PowerTools, such as WmlToHtmlConverter?

    in reply to: DocumentBuilder: Image gets lost #4103

    Eric White
    Keymaster

    Hi Michaela,

    It certainly is possible to enhance DocumentBuilder so that it handles external links. This modification can fit into the existing structure of DocumentBuilder with no issues. It is probably about 4-6 hours of work, including adding XUnit tests. You are welcome to try making the changes yourself.

    If you want me to make those changes, I am available on a consulting basis to do so.

    Best, Eric

    in reply to: DocumentBuilder: Image gets lost #4099

    Eric White
    Keymaster

    Hi Michaela,

    I have confirmed, yes, the DocumentBuilder does not handle external links. There are security issues associated with this, as well as technical issues. Actually, I think that from a security perspective, the behavior of DocumentBuilder should be to throw an exception when it encounters one of these links.
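If you want to check whether a source document has such links before passing it to DocumentBuilder, here is a minimal Python sketch (standard library only, not DocumentBuilder code). It assumes you have already read the XML of a relationships part, such as word/_rels/document.xml.rels, into a string. The function names and the sample markup are hypothetical; the check itself relies on the TargetMode="External" attribute that the packaging conventions use to mark external targets.

```python
import xml.etree.ElementTree as ET

# Namespace of the elements inside a .rels part
REL_NS = "http://schemas.openxmlformats.org/package/2006/relationships"


def external_relationships(rels_xml: str) -> list[tuple[str, str]]:
    """Return (Id, Target) for every relationship marked TargetMode="External"."""
    root = ET.fromstring(rels_xml)
    return [
        (rel.get("Id"), rel.get("Target"))
        for rel in root.iter(f"{{{REL_NS}}}Relationship")
        if rel.get("TargetMode") == "External"
    ]


def reject_external_links(rels_xml: str) -> None:
    """Raise if the part has any external relationship (the strict policy)."""
    found = external_relationships(rels_xml)
    if found:
        raise ValueError(f"external relationships found: {found}")


# Hypothetical sample: one embedded image and one externally linked image
IMAGE_TYPE = "http://schemas.openxmlformats.org/officeDocument/2006/relationships/image"
SAMPLE_RELS = (
    f'<Relationships xmlns="{REL_NS}">'
    f'<Relationship Id="rId1" Type="{IMAGE_TYPE}" Target="media/image1.png"/>'
    f'<Relationship Id="rId2" Type="{IMAGE_TYPE}" '
    'Target="file:///C:/pictures/logo.png" TargetMode="External"/>'
    '</Relationships>'
)

print(external_relationships(SAMPLE_RELS))
```

Whether to reject, skip, or copy such links is a policy decision; the sketch only shows how to find them.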

    I would love to hear your scenario, where external links are important when using DocumentBuilder. If you have time, would you write a paragraph or two as to why they are important?

    Best, Eric

    in reply to: DocumentBuilder: Image gets lost #4098

    Eric White
    Keymaster

    Hi Michaela,

    If I remember correctly, I didn’t address links to external documents. Such links are problematic at best, and I could not think of a scenario for DocumentBuilder where such links are important. I confess that I didn’t even think about what to do with those links, so I’m not surprised the code is broken.

    I’ll take a quick look, and see if there is an easy fix.

    Best, Eric

    in reply to: DocumentBuilder: Image gets lost #4093

    Eric White
    Keymaster

    Hi,

    Can you please post a test document (on dropbox or some such) and provide a link?

    DocumentBuilder should propagate all images into the assembled document, and if it does not, that would be a bug.

    Best, Eric

    in reply to: Nested Fields #4088

    Eric White
    Keymaster

    Hi Manu,

    Whenever dealing with fields, and certainly nested fields, I use the FieldRetriever.cs module in Open-Xml-PowerTools. This module retrieves (in nested form) all fields in a DOCX. Further, auxiliary information is in the data structures returned by FieldRetriever, such as object references to the various markup elements in the main document part. You can use those object references to alter / query the document.

    The FieldRetriever module serves two purposes – it provides nicely packaged functionality to C# developers to understand the fields in a document, including nested fields. Further, it provides a reference implementation regarding the correct approach for retrieving the instrText and the representation of fields in a document.
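For readers who want a feel for the markup before watching the screen-casts, here is a rough Python sketch of the general idea. It is not FieldRetriever and does not reproduce its API or data structures. It only handles complex fields (w:fldChar with begin / separate / end plus w:instrText), ignores w:fldSimple, and assumes well-formed markup. The function name, the dictionary layout, and the sample paragraph are all made up for illustration.

```python
import xml.etree.ElementTree as ET

W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
FLD_CHAR = f"{{{W_NS}}}fldChar"
FLD_CHAR_TYPE = f"{{{W_NS}}}fldCharType"
INSTR_TEXT = f"{{{W_NS}}}instrText"


def collect_fields(xml_text: str) -> list[dict]:
    """Return top-level complex fields, each with its instruction text and nested fields."""
    root = ET.fromstring(xml_text)
    top_level: list[dict] = []
    stack: list[dict] = []  # fields that have begun but not yet ended
    for el in root.iter():  # document order
        if el.tag == FLD_CHAR:
            kind = el.get(FLD_CHAR_TYPE)
            if kind == "begin":
                field = {"instr": "", "children": [], "in_instr": True}
                (stack[-1]["children"] if stack else top_level).append(field)
                stack.append(field)
            elif kind == "separate" and stack:
                stack[-1]["in_instr"] = False  # the cached result follows
            elif kind == "end" and stack:
                del stack.pop()["in_instr"]  # drop the bookkeeping flag
        elif el.tag == INSTR_TEXT and stack and stack[-1]["in_instr"]:
            stack[-1]["instr"] += el.text or ""
    return top_level


def run(inner: str) -> str:
    """Wrap markup in a w:r element (helper for building the sample only)."""
    return f"<w:r>{inner}</w:r>"


# Hypothetical paragraph: an IF field whose instruction contains a MERGEFIELD
SAMPLE = (
    f'<w:p xmlns:w="{W_NS}">'
    + run('<w:fldChar w:fldCharType="begin"/>')
    + run('<w:instrText> IF </w:instrText>')
    + run('<w:fldChar w:fldCharType="begin"/>')
    + run('<w:instrText> MERGEFIELD Name </w:instrText>')
    + run('<w:fldChar w:fldCharType="separate"/>')
    + run('<w:t>Bob</w:t>')
    + run('<w:fldChar w:fldCharType="end"/>')
    + run('<w:instrText> = "Bob" "yes" "no" </w:instrText>')
    + run('<w:fldChar w:fldCharType="separate"/>')
    + run('<w:t>yes</w:t>')
    + run('<w:fldChar w:fldCharType="end"/>')
    + '</w:p>'
)

fields = collect_fields(SAMPLE)
print(fields[0]["instr"].split()[0])              # IF
print(fields[0]["children"][0]["instr"].strip())  # MERGEFIELD Name
```

The stack is what makes nesting work: instruction text always belongs to the innermost field that has begun but not yet reached its separator.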

    Please watch screen-casts #14 and #15 in the following series:

    Introduction to WordprocessingML

    Cheers, Eric

    in reply to: Hyperlink fieldcode #4075

    Eric White
    Keymaster

    Hi Manu,

    The exact format of the text of field codes is defined in the Open XML standard. You must be prepared for those beginning and ending spaces.

    I believe that there is a grammar for the text of fields, if I remember correctly, so it should be possible to build a small parser for it.
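As a starting point, here is a small Python sketch of a tokenizer for field code text. It is a deliberate simplification and not an implementation of the grammar in the standard: it trims the leading and trailing spaces, splits on whitespace, and keeps double-quoted arguments together without un-escaping anything inside them. The function name and the sample field code are hypothetical.

```python
import re

# Either a double-quoted argument (backslash escapes allowed inside)
# or a run of non-whitespace characters such as a field name or a switch.
TOKEN = re.compile(r'"((?:[^"\\]|\\.)*)"|(\S+)')


def parse_field_code(instr_text: str) -> tuple[str, list[str]]:
    """Split field code text into (upper-cased field name, remaining tokens)."""
    tokens = []
    for match in TOKEN.finditer(instr_text.strip()):
        quoted, bare = match.group(1), match.group(2)
        tokens.append(quoted if quoted is not None else bare)
    if not tokens:
        return "", []
    return tokens[0].upper(), tokens[1:]


name, args = parse_field_code(r' HYPERLINK "http://example.com/a b" \l "sec1" ')
print(name)  # HYPERLINK
print(args)
```

Switches such as \l come back as ordinary tokens. Deciding which switches take an argument depends on the field type, so that step is left to the caller.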

    -Eric
