Fictive VAT numbers are different on GitLab
April 16–17, 2024
I’m investigating #5542 (Two VAT doctests fail because generated VAT numbers differ).
The demo fixture of the vat plugin assigns a fictive and randomly
generated (but syntactically valid) VAT number to each business partner.
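To illustrate what "randomly generated but syntactically valid" can mean, here is a minimal sketch for Belgian numbers only. It is not the code of generate_vid; the function names are hypothetical. It assumes the usual mod-97 rule for Belgian VAT numbers (the last two digits equal 97 minus the first eight digits modulo 97), which the numbers in the log output of this post do satisfy.

```python
import random


def fictive_be_vat(rng):
    """Return a fictive Belgian VAT number that passes the mod-97 check.

    `rng` is any random.Random instance (hypothetical helper, not Lino code).
    """
    base = rng.randint(10_000_000, 99_999_999)  # the first eight digits
    check = 97 - base % 97                      # check digits, between 1 and 97
    digits = "{:08d}{:02d}".format(base, check)
    return "BE {}.{}.{}".format(digits[:4], digits[4:7], digits[7:])


def is_valid_be_vat(vat_id):
    """Check the mod-97 rule on a string like 'BE 7088.996.857'."""
    digits = "".join(c for c in vat_id if c.isdigit())
    return len(digits) == 10 and 97 - int(digits[:8]) % 97 == int(digits[8:])
```

Passing the generator in as an argument keeps the helper independent of the global state of the random module.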
For some reason, the generated VAT numbers differ between my computer and GitLab
CI, causing two doctests (docs/plugins/eevat.rst and docs/plugins/bevats.rst) to fail.
What causes this difference? That’s the question of #5542!
The VAT numbers are assigned by VatNumberManager.generate_vid. I added a diagnostic log
message in this method:
dd.logger.info("20240416 generated VAT id {} for {}".format(self.obj.vat_id, self.obj))
Then I run inv prep with the new --verbose option. The first demo database to get
prepared is cosi2. On my machine the resulting messages are always the same:
20240416 generated VAT id BE 7088.996.857 for Bäckerei Ausdemwald
20240416 generated VAT id BE 4685.739.309 for Bäckerei Mießen
20240416 generated VAT id BE 4181.505.692 for Bäckerei Schmitz
20240416 generated VAT id BE 9045.438.159 for Garage Mergelsberg
20240416 generated VAT id NL 220.876.686B01 for Donderweer BV
20240416 generated VAT id NL 451.948.587B01 for Van Achter NV
20240416 generated VAT id DE 143.956.862 for Hans Flott & Co
20240416 generated VAT id DE 135.079.295 for Bernd Brechts Bücherladen
20240416 generated VAT id DE 138.433.397 for Reinhards Baumschule
20240416 generated VAT id FR 86.915.334.564 for Moulin Rouge
20240416 generated VAT id FR 66.435.589.280 for Auto École Verte
20240416 generated VAT id EE 848.217.541 for Maksu- ja Tolliamet
20240416 generated VAT id BE 4018.258.949 for Electrabel Customer Solutions
The same messages also appear on GitLab, but with other VAT numbers:
20240416 generated VAT id BE 2914.517.428 for Bäckerei Ausdemwald
20240416 generated VAT id BE 8750.836.192 for Bäckerei Mießen
20240416 generated VAT id BE 1958.116.531 for Bäckerei Schmitz
20240416 generated VAT id BE 4534.589.652 for Garage Mergelsberg
20240416 generated VAT id NL 237.725.353B01 for Donderweer BV
20240416 generated VAT id NL 643.080.485B01 for Van Achter NV
20240416 generated VAT id DE 928.188.312 for Hans Flott & Co
20240416 generated VAT id DE 593.748.463 for Bernd Brechts Bücherladen
20240416 generated VAT id DE 618.180.575 for Reinhards Baumschule
20240416 generated VAT id FR 65.449.289.186 for Moulin Rouge
20240416 generated VAT id FR 40.268.455.901 for Auto École Verte
20240416 generated VAT id EE 211.892.074 for Maksu- ja Tolliamet
20240416 generated VAT id BE 7659.012.310 for Electrabel Customer Solutions
The seed() method initializes the random number generator, and if you use
the same seed value twice you will get the same sequence of random numbers twice. I verify
this in Generating fictive VAT numbers (which passes both on my machine and on GitLab).
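The reproducibility claim can be checked with a few lines of plain Python. This is only a sketch of the principle, not the doctest mentioned above:

```python
import random

random.seed(1)
first = [random.randint(0, 9) for _ in range(5)]

random.seed(1)  # same seed value again
second = [random.randint(0, 9) for _ in range(5)]

# Both runs yield the same sequence, as long as nothing else
# touches the global generator between seed() and the draws.
assert first == second
```

The condition in the comment matters: the guarantee only holds when no other code draws from or reseeds the same generator in between.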
The generate_vid method is the only place in Lino where Python’s random
module is used (there is another import in the users plugin to generate the
verification code, but this is never called during inv prep).
Calling seed() without argument would seed from the operating system’s randomness
source or the system time (and therefore yield different random numbers). But we
deliberately call random.seed(1) at the module level of lino_xl.lib.vat.choicelists,
i.e. when that module is imported. I added another log message at that place:
20240417 random.seed(1)
NB: Until now the import and the random.seed(1) call had been conditional
(only when dd.is_installed("vat")), I removed this condition because it’s
not needed and because it adds complexity. But that didn’t fix our problem.
One theoretical possibility was that for some reason the sorting order of the business partners might differ when they get their VAT id. By comparing the output between my machine and GitLab we can now exclude this possibility (IOW we are advancing ;-).
I also had a closer look at the code in lino_xl.lib.vat.choicelists and
noticed this:
for cc, length in {
    "HR": 11,
    "DK": 8,
    "EE": 9,
    "FI": 8,
    "FR": 11,
    "DE": 9,
    "EL": 9,
    "HU": 8,
    "IT": 11,
    "LV": 11,
    "LT": 12,
    "LU": 8,
}.items():
    vat_origins.add_item(cc, VatOrigin(cc, length))
Which made me wonder whether VatOrigin objects might be instantiated in an order that
can vary. Since Python 3.7 a dict iterates in insertion order, so I don’t say that
this is the culprit, but it looked suspicious… I made a
series of bold simplifications to the code. Result: none.
Explanation
I found the explanation on April 20.
While showing the problem to Sharif I had the idea that I could “patch” the
seed() method of the random generator so that it logs every time it is called.
I added the following to lino/__init__.py:
import random
import inspect

def seed(self, a=None, version=2):
    stk = "\n".join(["{}:{}".format(s.filename, s.lineno) for s in inspect.stack()[1:3]])
    logger.info("20240420 random.seed(%s, %s) is called from %s", a, version, stk)
    self.original_seed(a=a, version=version)

random.Random.original_seed = random._inst.seed
random.Random.seed = seed
random.seed = random._inst.seed
And now:
(dev) luc@yoga:~/work/book/lino_book/projects/cosi2$ pm prep
20240420 random.seed(None, 2) is called from /usr/lib/python3.10/random.py:125
/home/luc/virtualenvs/dev/lib/python3.10/site-packages/sympy/core/random.py:29
20240420 random.seed(None, 2) is called from /usr/lib/python3.10/random.py:125
/home/luc/virtualenvs/dev/lib/python3.10/site-packages/sympy/core/symbol.py:419
20240420 random.seed(None, 2) is called from /usr/lib/python3.10/random.py:125
/home/luc/virtualenvs/dev/lib/python3.10/site-packages/sympy/ntheory/ecm.py:7
20240420 random.seed(None, 2) is called from /usr/lib/python3.10/random.py:125
/home/luc/virtualenvs/dev/lib/python3.10/site-packages/sympy/ntheory/qs.py:8
20240420 random.seed(1, 2) is called from /home/luc/work/xl/lino_xl/lib/vat/choicelists.py:27
<frozen importlib._bootstrap>:241
20240420 random.seed(None, 2) is called from /usr/lib/python3.10/random.py:125
/usr/lib/python3.10/tempfile.py:285
20240420 random.seed(None, 2) is called from /usr/lib/python3.10/random.py:125
/usr/lib/python3.10/tempfile.py:285
We are going to flush your database (/home/luc/work/book/lino_book/projects/cosi2/settings/default.db)
AND REMOVE ALL FILES BELOW /home/luc/work/book/lino_book/projects/cosi2/settings/media.
Are you sure (y/n) ? [Y,n]?
This made me realize that after calling random.seed() from
lino_xl.lib.vat.choicelists, it gets called two more times from
tempfile. Ha! No need to dig more! We must simply use our own random
generator in lino_xl.lib.vat.choicelists!
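A minimal sketch of what "our own random generator" could look like, assuming a module-level random.Random instance. The names _rng and draw are hypothetical and not necessarily what ends up in lino_xl.lib.vat.choicelists:

```python
import random

# Private generator with a fixed seed. Nothing outside this module
# holds a reference to it, so nobody else can reseed it.
_rng = random.Random(1)


def draw(rng, n=3):
    """Return n random integers drawn from the given generator."""
    return [rng.randint(0, 999) for _ in range(n)]


# Reference sequence from a fresh generator with the same seed.
expected = draw(random.Random(1))

# Simulate other code seeding generators behind our back.
random.seed()     # reseeds the module-level (global) generator
random.Random()   # creates and seeds an unrelated instance

# The private generator still yields the reference sequence.
actual = draw(_rng)
assert actual == expected
```

The design point is that the result depends only on the seed given to the private instance and on the order of its own draws, not on which other modules happen to be imported or when they seed their generators.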