20120301

xsd2py

Yesterday I made some first concrete attempts for the tx25 project. While looking at the XSD files I thought “No, I won’t convert them manually to an xmlgen module, I’d rather write a tool like generateDS, but one that produces Python code which uses lino.utils.xmlgen”. I called it xsd2py.

I then started to understand that Dave Kuhlman’s work on generateDS is far from trivial.

Yes, I admit I’m crazy: I’m going to continue anyway, just because I find the code generated by generateDS difficult to understand and maintain.

Ian Bicking writes about xml.dom.minidom: “a document model built into the standard library, which html5lib can parse to. (I do not recommend using minidom for anything — some reasons will become apparent in this post, but there are many other reasons not covered why you shouldn’t use it.)” (http://blog.ianbicking.org/2008/03/30/python-html-parser-performance/)

In that same post he also writes “I expected lxml to perform well, as it is based on the C library libxml2. But it performed better than I realized, far better than any other library. As a result, if it wasn’t for some persistent installation problems (especially on Macs) I would recommend lxml for just about any HTML task.”

So I’m going to use lxml.

After reading http://lxml.de/tutorial.html it was easy to write a new version of xsd2py which produces the same output, but uses lxml instead of minidom.
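To give an idea of what reading an XSD file with lxml looks like, here is a minimal sketch. It is not the actual xsd2py code: the sample schema and the `element_names` helper are made up for illustration. It only collects the names of all `xs:element` declarations in document order.

```python
from lxml import etree

# Namespace of the XML Schema vocabulary itself.
XSD_NS = "http://www.w3.org/2001/XMLSchema"

# Invented sample schema, only here to make the sketch runnable.
SAMPLE_XSD = b"""<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="Person">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="FirstName" type="xs:string"/>
        <xs:element name="LastName" type="xs:string"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
"""


def element_names(xsd_bytes):
    """Return the name of every xs:element declaration, in document order.

    Hypothetical helper, not part of xsd2py or lxml.
    """
    root = etree.fromstring(xsd_bytes)
    # lxml addresses namespaced tags in "{namespace}localname" notation.
    return [el.get("name") for el in root.iter("{%s}element" % XSD_NS)]


names = element_names(SAMPLE_XSD)
print(names)
```

A real generator would of course also have to look at types, sequences and attributes; this only shows the traversal part.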

But then, when I reached the E-factory section (http://lxml.de/tutorial.html#the-e-factory), I started to understand that this is the wheel my lino.utils.xmlgen is reinventing!
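For readers who don’t know the E-factory, here is a small sketch of how it is used, following the lxml tutorial. The document content (title, heading, the `id` value) is invented for the example.

```python
from lxml import etree
from lxml.builder import E

# Every attribute access on E creates a factory for a tag of that name.
# Positional string arguments become text, element arguments become
# children, and keyword arguments become XML attributes.
doc = E.html(
    E.head(E.title("Demo")),
    E.body(
        E.h1("Hello"),
        E.p("Some ", E.b("bold"), " text.", id="intro"),
    ),
)

xml = etree.tostring(doc, encoding="unicode")
print(xml)
```

The nesting of the Python calls mirrors the nesting of the generated XML, which is exactly the kind of notation xmlgen was trying to provide.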

Confirmation: I rewrote table2xhtml() using lxml instead of xmlgen.
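The following sketch shows the general idea of building an XHTML table with the E-factory. It is not the real table2xhtml() from Lino: the function name `rows_to_table` and its signature (a list of headers plus a list of rows) are assumptions made for this example.

```python
from lxml import etree
from lxml.builder import E


def rows_to_table(headers, rows):
    """Build a <table> element from plain Python data.

    Hypothetical stand-in for illustration, not Lino's table2xhtml().
    """
    return E.table(
        E.thead(E.tr(*[E.th(str(h)) for h in headers])),
        # One <tr> per row, one <td> per cell; cells are converted to text.
        E.tbody(*[E.tr(*[E.td(str(c)) for c in row]) for row in rows]),
    )


table = rows_to_table(["Name", "Age"], [["Alice", 30], ["Bob", 25]])
html = etree.tostring(table, encoding="unicode")
print(html)
```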

The next step will be to do the same with lino.utils.xmlgen.bcss.

So my xmlgen is going to be deprecated because lxml is much better and faster.