20130824 (Saturday, 24 August 2013)

The bug in my html2xhtml (see 2013-08-21) was even more complex than I had imagined that day. For example, the following HTML:

<strong>
<ul>
<em><li>Foo</li></em>
<li>Bar</li>
</ul>
</strong>

should become:

<ul>
<li><strong><em>Foo</em></strong></li>
<li><strong>Bar</strong></li>
</ul>

After trying to reinvent the wheel and realizing that my own solution was getting too complicated, I discovered pytidylib. Voilà, all problems solved! My html2xhtml is now just a wrapper around pytidylib (which is itself a wrapper around HTML Tidy).
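As a rough illustration of what such a wrapper might look like, here is a minimal sketch. It is not the actual html2xhtml code: the option values and the `tidy` parameter (a hook for passing in a replacement function, handy for testing without libtidy) are assumptions made for this example. The only pytidylib call used is `tidy_fragment`, which returns a tuple of the cleaned markup and an error report.

```python
def html2xhtml(html, tidy=None):
    """Return the HTML fragment `html` as XHTML, cleaned up by HTML Tidy.

    `tidy` is a hypothetical hook: any callable with the same interface
    as tidylib.tidy_fragment. It defaults to the real one.
    """
    if tidy is None:
        # Imported lazily, so this module can be loaded even when
        # pytidylib is not installed.
        from tidylib import tidy_fragment as tidy

    # Assumed options: ask Tidy for XHTML output and omit the doctype.
    options = {"output-xhtml": 1, "doctype": "omit"}

    # tidy_fragment returns (fragment, errors); this sketch ignores
    # the error report and returns only the markup.
    xhtml, errors = tidy(html, options=options)
    return xhtml
```

Calling `html2xhtml("<strong><ul><em><li>Foo</li></em></ul></strong>")` would then hand the whole nesting problem over to HTML Tidy. How exactly Tidy rearranges the tags depends on the installed version and the options given.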

Added a new file /requirements.txt which contains “pytidylib”.

Install it either using apt-get install python-tidylib or:

$ sudo apt-get install libtidy-dev
$ pip install pytidylib
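
Note that the pip package is called pytidylib while the Python module it installs is called tidylib. A quick way to check whether the module can be found after installing is sketched below. The function name `tidylib_available` is made up for this example, and the check only looks for the Python module, not for the underlying libtidy library.

```python
import importlib.util


def tidylib_available():
    """Return True if the `tidylib` module can be located."""
    return importlib.util.find_spec("tidylib") is not None


if __name__ == "__main__":
    print("tidylib importable:", tidylib_available())
```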

Also updated the requirements.txt for Lino Welfare which is at https://code.google.com/p/lino-welfare/source/browse/requirements.txt