xhtml() doesn’t like <div> inside a <span>
You should be able to run the snippets on this page and reproduce the problem
by downloading the files bug.rst and 0821.odt from /docs/blog/2013/0821
into a folder of your choice and then running:
$ python -m doctest bug.rst
0821.odt contains a simple appy pod "from" clause:
do text
from xhtml(chunk)
We are going to render this into a file out.odt:
>>> OUTFILE = 'out.odt'
Alternatively, you might try to render to a .pdf file if you have an
OpenOffice or LibreOffice server running on port 2002 (uncomment the
following line in your copy of bug.rst):
>>> # OUTFILE = 'out.pdf'
When chunk is the following, then it works:
>>> html = u'<p><div><span>it works!</span></div></p>'
But when I invert the nesting (<div> inside <span>), it fails:
>>> html = u'<p><span><div>Oops</div></span></p>'
Another example of HTML that TinyMCE happens to produce is this:
>>> html = '<strong><ul><li>Foo</li><li>Bar</li></ul></strong>'
Here is how it should be:
>>> html = u'<ul><li><strong>Foo</strong></li><li><strong>Bar</strong></li></ul>'
The following snippet will try to render it:
>>> import os
>>> from appy.pod.renderer import Renderer
>>> html = html.encode('utf-8')
>>> context = dict(chunk=html)
>>> if os.path.exists(OUTFILE):
...     os.remove(OUTFILE)
>>> r = Renderer('0821.odt',context,OUTFILE)
>>> r.run()
>>> os.path.exists(OUTFILE)
True
The file out.odt now exists, but it contains an invalid content.xml, and LibreOffice will complain when you try to open it.
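One way to confirm the problem without opening LibreOffice is to check whether content.xml inside the generated file still parses as XML. The following is only a minimal sketch using the standard library. The helper name `content_xml_is_well_formed` is made up for this illustration, and it tests well-formedness only, not validity against the ODF schema, so it assumes the damage shows up as a parse error.

```python
import zipfile
import xml.etree.ElementTree as ET


def content_xml_is_well_formed(odt_path):
    """Return True if content.xml inside the .odt archive parses as XML.

    Hypothetical helper for illustration; it does not validate against
    the ODF schema.
    """
    # An .odt file is a zip archive; the document body lives in content.xml.
    with zipfile.ZipFile(odt_path) as archive:
        data = archive.read('content.xml')
    try:
        ET.fromstring(data)
    except ET.ParseError:
        return False
    return True
```

Calling `content_xml_is_well_formed('out.odt')` after `r.run()` would then give a quick yes/no answer, under the assumption stated above.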
I originally wrote this page for Gaëtan in the hope that he would
fix this bug in appy pod… but then I understood:
in fact Appy Pod is right!
A <div> inside a <span> is not valid XHTML.
A <li> inside a <strong> is not valid XHTML.
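To spot this kind of nesting before handing a chunk to the renderer, a small checker can be sketched with Python 3's standard `html.parser`. Everything named here (`NestingChecker`, `find_bad_nesting`, and the `INLINE` and `BLOCK` tag sets) is hypothetical and deliberately incomplete. It only reports block-level tags opened inside inline tags and is not a full XHTML validator.

```python
from html.parser import HTMLParser

# Assumed, incomplete tag sets chosen for this illustration only.
INLINE = {'span', 'strong', 'em', 'b', 'i', 'a'}
BLOCK = {'div', 'p', 'ul', 'ol', 'li', 'table'}


class NestingChecker(HTMLParser):
    """Collect (inline, block) pairs where a block tag opens inside an inline tag."""

    def __init__(self):
        super().__init__()
        self.stack = []      # tags that are currently open
        self.problems = []   # (inline_ancestor, block_tag) pairs

    def handle_starttag(self, tag, attrs):
        if tag in BLOCK:
            # Look for the nearest open inline ancestor, if any.
            for open_tag in reversed(self.stack):
                if open_tag in INLINE:
                    self.problems.append((open_tag, tag))
                    break
        self.stack.append(tag)

    def handle_endtag(self, tag):
        # Pop back to the matching open tag, tolerating sloppy markup.
        if tag in self.stack:
            while self.stack.pop() != tag:
                pass


def find_bad_nesting(html):
    """Return the list of (inline, block) nesting problems found in html."""
    checker = NestingChecker()
    checker.feed(html)
    checker.close()
    return checker.problems
```

For the failing chunk `<p><span><div>Oops</div></span></p>` this returns `[('span', 'div')]`, and for the chunk that works it returns an empty list.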
According to Mac on Stack Overflow,
“several websites use this method for styling”,
but the bug is not in Gaëtan’s renderXhtml method;
it is in my own code, in lino.utils.html2xhtml.
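As an illustration of the kind of repair such a conversion step could apply, here is a rough Python 3 sketch for the list case only. It is not how lino.utils.html2xhtml works. The function name `push_inline_into_items` and the `INLINE` set are invented for this example, and it assumes the chunk is a well-formed XML fragment with a single root element and no HTML-only entities.

```python
import xml.etree.ElementTree as ET

# Assumed, incomplete set of inline tags for this illustration.
INLINE = {'strong', 'em', 'b', 'i', 'span'}


def push_inline_into_items(fragment):
    """Rewrite <inline><ul><li>x</li></ul></inline> as <ul><li><inline>x</inline></li></ul>.

    Hypothetical helper: any fragment that does not match this simple
    shape is returned unchanged.
    """
    root = ET.fromstring(fragment)
    children = list(root)
    # Only handle an inline element whose sole content is one list.
    if (root.tag not in INLINE or len(children) != 1
            or children[0].tag not in ('ul', 'ol')
            or (root.text or '').strip()
            or (children[0].tail or '').strip()):
        return fragment
    lst = children[0]
    lst.tail = None
    for item in lst.findall('li'):
        wrapper = ET.Element(root.tag, root.attrib)
        # Move the item's text and children into the new inline wrapper.
        wrapper.text = item.text
        item.text = None
        for child in list(item):
            item.remove(child)
            wrapper.append(child)
        item.append(wrapper)
    return ET.tostring(lst, encoding='unicode')
```

Applied to the TinyMCE output `<strong><ul><li>Foo</li><li>Bar</li></ul></strong>`, this yields `<ul><li><strong>Foo</strong></li><li><strong>Bar</strong></li></ul>`. The `<div>` inside `<span>` case would need a different rule and is not covered here.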
(Edit 20130823: added the <li> inside <strong> example)