I recently completed a new version of ListItemRetriever.cs, which is a super important module in PowerTools for Open XML, although it operates mostly behind the scenes. This module is responsible for translating the various pieces of markup for numbered and bulleted lists into the actual text that HtmlConverter.cs will place in the generated HTML. It was a test of my patience: I first patched the old version, then I rewrote it, and then I threw it all out and rewrote it again. I am finally happy with it.
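To give a rough feel for what "translating list markup into text" means, here is a minimal sketch in Python. It is not the code in ListItemRetriever.cs, and the function names are made up for illustration. It assumes the level-text pattern (for example `%1.%2.`, where `%n` stands for the counter at level n) and the per-level number formats have already been read from the numbering markup, and it handles only a few formats. The real module has to deal with far more cases.

```python
import re


def to_roman(n: int) -> str:
    """Convert a positive integer to an upper-case Roman numeral."""
    pairs = [
        (1000, "M"), (900, "CM"), (500, "D"), (400, "CD"),
        (100, "C"), (90, "XC"), (50, "L"), (40, "XL"),
        (10, "X"), (9, "IX"), (5, "V"), (4, "IV"), (1, "I"),
    ]
    out = []
    for value, symbol in pairs:
        while n >= value:
            out.append(symbol)
            n -= value
    return "".join(out)


def format_number(n: int, num_format: str) -> str:
    """Render one counter value in the given number format (hypothetical helper)."""
    if num_format == "decimal":
        return str(n)
    if num_format == "upperRoman":
        return to_roman(n)
    if num_format == "lowerRoman":
        return to_roman(n).lower()
    if num_format == "lowerLetter":
        # Assumption: after 'z' the letter repeats (aa, bb, cc, ...).
        letter = chr(ord("a") + (n - 1) % 26)
        return letter * ((n - 1) // 26 + 1)
    # Anything else falls back to decimal in this sketch.
    return str(n)


def format_list_item(level_text: str, counters: list, formats: list) -> str:
    """Replace each %n placeholder with the formatted counter for level n.

    counters[i] and formats[i] describe level i + 1. Text without
    placeholders (such as a bullet character) is returned unchanged.
    """
    def replace(match):
        index = int(match.group(1)) - 1
        return format_number(counters[index], formats[index])

    return re.sub(r"%(\d)", replace, level_text)


print(format_list_item("%1.%2.", [2, 3], ["decimal", "decimal"]))
print(format_list_item("(%1)", [4], ["lowerLetter"]))
print(format_list_item("%1.", [9], ["upperRoman"]))
```

The hard part in practice is not this substitution step but working out the right counters and formats for each paragraph, which this sketch simply takes as input.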
Now that this module is completed, it is time to jump back into some serious coding for the high-fidelity HtmlConverter.cs module. My next goal is to complete support for right-to-left (RTL) languages and East Asian languages.
I always try to do the hard stuff first, and because of my unfamiliarity with the Open XML markup that I need to parse, this is somewhat hard. Also, because I don’t read any RTL or East Asian languages, I have to do this by pattern matching. Sure would be easier if I could read them… 🙂
I don’t have a good idea of how long it will take.
Some time ago I wrote a crude program that uses search engines to find Open XML documents on the web and download them. I have a pretty large collection of them. In general, my work will be to run HtmlConverter.cs on these documents and manually compare the docs in Word with the converted HTML in a browser. Fun.
As for where I’m going: I want to have this high-fidelity conversion from DOCX to HTML in really good shape in the next 2-3 months.
Following that, I want to rewrite the portions of PowerTools that we use from PowerShell. I want to rewrite all of the cmdlets in the PowerShell language instead of C#. After rewriting the cmdlets, I believe that installing and using them will be a matter of dropping some files in a specific place. It will also make it much easier for users of PowerTools to build new cmdlets and to modify the existing ones.
This is my vision for PowerTools for Open XML 3.0.