Monday, June 24, 2024

I would like to validate the e-invoices generated by my Python program. The Peppol BIS Billing 3.0 - November 2023 Release page provides two Schematron files for download. And LXML knows how to validate an XML file using Schematron files. So theoretically everything is clear…

… but both Schematron files fail to load into LXML. Here is what I tried.

First I wrote a utility function that downloads the file and then instantiates an LXML schematron schema from it.

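>>> import requests
>>> from lxml import isoschematron
>>> def load(url, target):
...     # Download url + target into a local file named target.
...     r = requests.get(url + target, stream=True)
...     r.raise_for_status()  # stop on an HTTP error instead of parsing an error page
...     with open(target, "wb") as f:
...         for chunk in r.iter_content(chunk_size=16 * 1024):
...             f.write(chunk)
...     # Parse the downloaded file as an ISO Schematron schema.
...     return isoschematron.Schematron(file=target)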

The first Schematron file, “Schematron for PEPPOL rules (UBL)”, is rejected by LXML as an invalid Schematron schema:

>>> load("", "PEPPOL-EN16931-UBL.sch")
Traceback (most recent call last):
lxml.etree.SchematronParseError: invalid schematron schema: <string>:498:0:ERROR:RELAXNGV:RELAXNG_ERR_ELEMNAME: Expecting element pattern, got let
<string>:0:0:ERROR:RELAXNGV:RELAXNG_ERR_INTEREXTRA: Extra element let in interleave
<string>:8:0:ERROR:RELAXNGV:RELAXNG_ERR_ELEMNAME: Expecting element ns, got title
<string>:7:0:ERROR:RELAXNGV:RELAXNG_ERR_CONTENTVALID: Element schema failed to validate content
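
One way to rule out a problem with the LXML installation itself is to load a tiny schema through the same isoschematron module. The following is only a minimal sketch: the schema and the two sample documents are made up for illustration and have nothing to do with the Peppol rules.

```python
from lxml import etree, isoschematron

# Made-up schema: the Percent values below Total must add up to 100.
SCHEMA = b"""\
<schema xmlns="http://purl.oclc.org/dsdl/schematron">
  <pattern id="sum_equals_100_percent">
    <title>Sum equals 100%.</title>
    <rule context="Total">
      <assert test="sum(//Percent) = 100">Sum is not 100%.</assert>
    </rule>
  </pattern>
</schema>
"""

# Schematron() accepts a parsed element as well as a file name.
schematron = isoschematron.Schematron(etree.fromstring(SCHEMA))

# Two made-up documents: one satisfies the rule, one does not.
good = etree.fromstring(b"<Total><Percent>60</Percent><Percent>40</Percent></Total>")
bad = etree.fromstring(b"<Total><Percent>60</Percent><Percent>30</Percent></Total>")

is_good_valid = schematron.validate(good)
is_bad_valid = schematron.validate(bad)
print(is_good_valid, is_bad_valid)
```

If this small schema loads and validates as expected, the failure is more likely specific to the downloaded file than to LXML's Schematron support in general.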

The second Schematron file, “Schematron for TC434 rules (UBL)”, uses the “xslt2” query language, which LXML's ISO Schematron implementation refuses to process:

>>> load("", "CEN-EN16931-UBL.sch")
Traceback (most recent call last):
lxml.etree.XSLTApplyError: Fail: This implementation of ISO Schematron does not work with
    schemas using the "xslt2" query language.
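
The query language is declared in the queryBinding attribute on the root element of a Schematron file, so it can be checked before handing the file to LXML. LXML builds on libxslt, which implements XSLT 1.0, and that would explain why an "xslt2" binding is refused. Below is a minimal sketch using only the standard library; the helper name query_binding and the one-line sample schema are assumptions made for illustration.

```python
import io
import xml.etree.ElementTree as ET


def query_binding(source):
    """Return the queryBinding attribute of a Schematron file, or None if absent.

    source may be a file name or a binary file-like object.
    """
    root = ET.parse(source).getroot()
    return root.get("queryBinding")


# Made-up one-element schema standing in for a downloaded .sch file.
sample = io.BytesIO(
    b'<schema xmlns="http://purl.oclc.org/dsdl/schematron" queryBinding="xslt2"/>'
)
binding = query_binding(sample)
print(binding)
```

The same helper could be called with a file name such as "CEN-EN16931-UBL.sch" after the download step.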