Thursday, September 13, 2018
Sphinx 1.8 and feedformatter
I installed and tried the new Sphinx 1.8. There was a problem when generating my blog:
Traceback (most recent call last):
  File "/site-packages/sphinx/cmd/build.py", line 304, in build_main
    app.build(args.force_all, filenames)
  File "/site-packages/sphinx/application.py", line 369, in build
    self.emit('build-finished', None)
  File "/site-packages/sphinx/application.py", line 510, in emit
    return self.events.emit(event, self, *args)
  File "/site-packages/sphinx/events.py", line 80, in emit
    results.append(callback(*args))
  File "/sphinxfeed/sphinxfeed.py", line 95, in emit_feed
    feed.format_rss2_file(path)
  File "/site-packages/feedformatter.py", line 399, in format_rss2_file
    string = self.format_rss2_string(validate, pretty)
  File "/site-packages/feedformatter.py", line 393, in format_rss2_string
    return _stringify(RSS2root, pretty=pretty)
  File "/site-packages/feedformatter.py", line 273, in _stringify
    return ET.tostring(tree)
  File "/usr/lib/python2.7/xml/etree/ElementTree.py", line 1126, in tostring
    ElementTree(element).write(file, encoding, method=method)
  File "/usr/lib/python2.7/xml/etree/ElementTree.py", line 820, in write
    serialize(write, self._root, encoding, qnames, namespaces)
  File "/etgen/etgen/etree.py", line 29, in _serialize_xml
    return _original_serialize_xml(write, elem, *args, **kwargs)
  File "/usr/lib/python2.7/xml/etree/ElementTree.py", line 939, in _serialize_xml
    _serialize_xml(write, e, encoding, qnames, None)
  File "/etgen/etgen/etree.py", line 29, in _serialize_xml
    return _original_serialize_xml(write, elem, *args, **kwargs)
  File "/usr/lib/python2.7/xml/etree/ElementTree.py", line 939, in _serialize_xml
    _serialize_xml(write, e, encoding, qnames, None)
  File "/etgen/etgen/etree.py", line 29, in _serialize_xml
    return _original_serialize_xml(write, elem, *args, **kwargs)
  File "/usr/lib/python2.7/xml/etree/ElementTree.py", line 939, in _serialize_xml
    _serialize_xml(write, e, encoding, qnames, None)
  File "/etgen/etgen/etree.py", line 29, in _serialize_xml
    return _original_serialize_xml(write, elem, *args, **kwargs)
  File "/usr/lib/python2.7/xml/etree/ElementTree.py", line 937, in _serialize_xml
    write(_escape_cdata(text, encoding))
  File "/usr/lib/python2.7/xml/etree/ElementTree.py", line 1073, in _escape_cdata
    return text.encode(encoding, "xmlcharrefreplace")
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 7: ordinal not in range(128)
I opened #2534 and explored the problem:
It seemed that this was caused by either etgen or feedformatter:
etgen: I removed the patch in etgen.etree because it looked suspicious. But that didn’t help, so etgen.etree is probably innocent. That patch for writing CDATA may have become useless, but I am not sure, so I am leaving it there for the moment.
feedformatter: I tried a newer version. That didn’t help either. Note that feedformatter seems to be unmaintained. The PyPI version is 0.4 and still points to code.google.com, but there are two forks. I created a third fork but deleted it again when I found the explanation below.
Adding a try…except in my
/usr/lib/python2.7/xml/etree/ElementTree.py finally revealed
the explanation which I am going to simulate here:
sphinxfeed sets the pubDate field of feed items to a time.struct_time:
>>> import time
>>> fmt = '%Y-%m-%d %H:%M'
>>> pubDate = time.strptime("2018-03-13 11:07", fmt)
>>> pubDate
time.struct_time(tm_year=2018, tm_mon=3, tm_mday=13, tm_hour=11, tm_min=7, tm_sec=0, tm_wday=1, tm_yday=72, tm_isdst=-1)
When sphinxfeed then calls feedformatter, feedformatter writes all
dates using the format demanded by the RSS 2.0 specification, which itself refers
to the venerable RFC 822 (search for
“- 25 -” in that document to get to the “5. DATE AND
TIME SPECIFICATION” section). Anyway, here is what a pubDate field in
an rss.xml file should look like:
>>> s = time.strftime("%a, %d %b %Y %H:%M:%S %Z", pubDate)
>>> repr(s)
"'Tue, 13 Mar 2018 11:07:00 '"
Now Sphinx version 1.8 (at least on my machine) sets the locale to Estonian:
>>> import locale
>>> locale.setlocale(locale.LC_ALL, 'et_EE.utf8')
'et_EE.utf8'
And feedformatter now gets a localized string containing non-ASCII characters, which under Python 2 is not even a unicode string but a bytestring:
>>> s = time.strftime("%a, %d %b %Y %H:%M:%S %Z", pubDate)
>>> type(s)
<type 'str'>
>>> repr(s)
"'T, 13 m\\xc3\\xa4rts 2018 11:07:00 '"
And when trying to serialize that bytestring, we get our decoding error:
>>> s.encode("ascii", 'xmlcharrefreplace')
Traceback (most recent call last):
  File "/usr/lib/python2.7/doctest.py", line 1315, in __run
    compileflags, 1) in test.globs
  File "<doctest 0913.rst>", line 1, in <module>
    s.encode("ascii", 'xmlcharrefreplace')
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 7: ordinal not in range(128)
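The same failure can be sketched in Python 3, where the hidden step has to be written out. Under Python 2, calling encode() on a bytestring first decodes it with the default ASCII codec; the explicit decode("ascii") below stands in for that implicit step. Treating the bytes as UTF-8 is an assumption that matches the et_EE.utf8 locale used above.

```python
# The localized date as UTF-8 bytes, i.e. the kind of bytestring that
# Python 2's time.strftime returned under the Estonian locale.
localized = "T, 13 märts 2018 11:07:00 ".encode("utf-8")

# Python 2 implicitly decoded bytestrings as ASCII before encoding them.
# Written out explicitly, this fails at the first byte of "ä" (0xc3).
try:
    localized.decode("ascii").encode("ascii", "xmlcharrefreplace")
    failure = None
except UnicodeDecodeError as exc:
    failure = exc

# Decoding with the codec the locale actually uses lets the
# xmlcharrefreplace step do its job.
escaped = localized.decode("utf-8").encode("ascii", "xmlcharrefreplace")

print(failure)
print(escaped)
```

The offending byte sits at position 7 because "ä" is the eighth character of the string and is the first one that needs more than one byte in UTF-8.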
It is true that I live in Estonia and that my Ubuntu system probably
has some setting somewhere saying this. But in my
language = 'en'
So why does Sphinx version 1.8 set the locale to “Estonian” on my
machine? It is because of the environment variable
I can work around the problem by setting this variable to
en_GB.UTF-8 before building:
$ export LC_TIME=en_GB.UTF-8
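Another way to sidestep the locale, sketched here as one possibility and not as what sphinxfeed or feedformatter actually do, is to build the RFC 822 date with the standard library's email.utils.formatdate, which always uses English day and month names. Interpreting the naive struct_time as UTC is an assumption made for this example.

```python
import calendar
import time
from email.utils import formatdate

pubDate = time.strptime("2018-03-13 11:07", "%Y-%m-%d %H:%M")

# Assumption: the struct_time carries no timezone, so treat it as UTC.
timestamp = calendar.timegm(pubDate)

# formatdate does not consult the locale, so LC_TIME has no effect here.
rfc822_date = formatdate(timestamp, usegmt=True)
print(rfc822_date)
```

The result is plain ASCII regardless of the LC_TIME setting, so the serialization step never sees non-ASCII bytes.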
I added a unit test in my sphinxfeed clone which reproduces the
LC_TIME=et_EE.UTF-8) : The test suite passes with
“Sphinx<1.8” and fails with 1.8.
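The idea behind such a test can be sketched as follows. This is not the test from my sphinxfeed clone; the helper names lc_time and format_pubdate are hypothetical. It switches LC_TIME temporarily and checks what strftime produces. The "C" locale is used for the positive case because it exists everywhere, while et_EE.UTF-8 may not be installed on a given machine.

```python
import contextlib
import locale
import time


@contextlib.contextmanager
def lc_time(name):
    """Temporarily switch LC_TIME to `name` (hypothetical helper)."""
    saved = locale.setlocale(locale.LC_TIME)  # query the current setting
    locale.setlocale(locale.LC_TIME, name)    # raises locale.Error if unknown
    try:
        yield
    finally:
        locale.setlocale(locale.LC_TIME, saved)


def format_pubdate(struct):
    """Format a struct_time the locale-dependent way, via strftime."""
    return time.strftime("%a, %d %b %Y %H:%M:%S", struct)


before = locale.setlocale(locale.LC_TIME)
pubDate = time.strptime("2018-03-13 11:07", "%Y-%m-%d %H:%M")

# In the "C" locale the day and month names are English.
with lc_time("C"):
    in_c_locale = format_pubdate(pubDate)

# The Estonian locale is optional: skip it when it is not installed.
try:
    with lc_time("et_EE.UTF-8"):
        in_estonian = format_pubdate(pubDate)
except locale.Error:
    in_estonian = None

print(in_c_locale)
print(in_estonian)
```

On a machine where the Estonian locale is available, the two strings differ, which is exactly the locale dependence that makes the feed generation fail.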
But setting my
en_GB.UTF-8 is not really a
Lino and WeasyPrint
The new accounting report shows us that WeasyPrint is a great tool for most Lino printing jobs. That’s why I invested some time into trying to find out who’s behind this package.
Oh, here is a post by its author (gayoub from kozea group) where he explains why he wrote WeasyPrint: Comment générer automatiquement des jolis documents ? It’s so nice to read about somebody who shares similar experiences and feelings about producing printable documents!
Later I read another blog post by the Kozea group: Philippe et sa montre, an interview with Philippe Donadieu, manager of the Kozea group. Their main product is a suite of software solutions for drugstores in France. It seems to be proprietary software, though.
And who is Simon Sapin, the first author mentioned in that file? According to his site exyr.org he has previously worked on WeasyPrint at Kozea. In 2012 he presented WeasyPrint at a W3C Developer Meetup in Lyon. On the slides I read that Kozea had 10 employees at that time, was located in the Lyon area and built custom web applications for businesses (“Industrialization, HTML5/CSS3 e-learning and Semi-automated reporting”). And that they had recently become a W3C member. Which seems to be no longer true (at least they aren’t listed here).
Their community website finally confirms that they invite us to collaborate or to just tell them about ourselves.
It seems that Simon left the Kozea community when he left his job there, and that he has moved away from Python to Rust since then.
WeasyPrint was written and is maintained by a “corporate-driven community”. But unlike the Python extension for Visual Studio Code (see Monday, September 10, 2018), this is what I would call a corporation working for a free culture, because their product also serves people who are not customers of the corporation. That’s why I find Kozea more likeable than Microsoft.
Hi Simon and Guillaume, I’d like to say thank you for the great work
you have done and are doing on WeasyPrint! I hope that its
maintenance will continue to give you much joy and satisfaction. At
the moment we just use WeasyPrint (in the
lino.modlib.weasyprint plugin), and WP simply works as
expected. This is great! Don’t expect active contributions because
we have other things to do as well. Let us know if you see how we can
Lino Tera for therapists, continued
Overdue appointments: do not include those from today, only those from yesterday or earlier.
users.UserDetail has no tabs (Dashboard, event_type, …)
Changed the symbol for a “Cancelled” calendar entry in
lino_xl.lib.cal from ☉ to ⚕, because the symbol ☉ (a sun) is used in Lino Tera for events where the guest missed the appointment without a valid reason. The sun is reminiscent of a day on the beach, while the ⚕ is reminiscent of a drugstore.
New state “Verpasst” (“Missed – Guest missed the appointment”) for appointments.
Note the analogy: a guest (participant) can be “absent” or “excused”, an appointment can be “missed” or “cancelled”. In Lino Tera we will need this analogy because they have a mixture of group appointments and individual appointments.