Monday, July 23, 2018

'datetime.time' object has no attribute 'date'

How can lino_xl.lib.cal.Guest.waiting_since contain a datetime.time value?

>>> from lino import startup
>>> startup('lino_welfare.projects.eupen.settings.doctests')
>>> from lino.api.doctest import *
>>> translation.activate('fr')
>>> from django.utils import timezone
>>> now = timezone.now()
>>> now.date()
datetime.date(2018, 7, 23)
>>> type(now)
<type 'datetime.datetime'>

The presence_certificate.body.html was using lino_xl.lib.excerpts.Excerpt.time instead of lino_xl.lib.excerpts.Excerpt.date.
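The error message itself is easy to reproduce with the standard library alone. A minimal sketch (the sample timestamp is arbitrary and the variable names are mine): a datetime.datetime has a date() method, but the datetime.time extracted from it does not.

```python
import datetime

stamp = datetime.datetime(2018, 7, 23, 10, 30)  # arbitrary sample value
only_time = stamp.time()  # a datetime.time: the date part is gone

# A full datetime knows its date.
print(stamp.date())

# A bare time does not, so calling .date() raises AttributeError.
try:
    only_time.date()
except AttributeError as exc:
    message = str(exc)
    print(message)
```

So any code that expects a datetime and receives only the time part will fail in this way as soon as it asks for the date.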

It took me half an hour to verify that this issue had actually been fixed back in January 2018. But they haven't upgraded since then, and I never registered the issue as a ticket. So I have now opened #2443.

Using data from Wikidata

Yesterday I took my first steps with pywikibot.

My first script:

import pywikibot

site = pywikibot.Site('en', 'wikipedia')
page = pywikibot.Page(site, 'User:LucSaffre/sandbox')
page.text = page.text.replace('foo', 'bar')
page.save('Replaced foo by bar')  # Saves the page using the given summary

My second script:

from pprint import pprint
import pywikibot

site = pywikibot.Site('en', 'wikipedia')  # The site we want to run our bot on
if False:  # works
    page = pywikibot.Page(site, 'User:LucSaffre/sandbox')
    page.text = page.text.replace('foo', 'bar')
    page.save('kkk')  # Saves the page

# compare https://www.wikidata.org/wiki/Q3462598
page = pywikibot.Page(site, "Vana-Vigala")
item = pywikibot.ItemPage.fromPage(page)
#print(item.get()['sitelinks'])
claims = item.get()['claims']
for k in claims.keys():
    print(k, claims[k])
    break

My first script with a concrete goal (it would be nice if my commondata project became useless):

# started on 20180723, continued 20181109
# https://www.wikidata.org/wiki/Wikidata:Pywikibot_-_Python_3_Tutorial/Iterate_over_a_SPARQL_query
# https://www.wikidata.org/wiki/Wikidata:Pywikibot_-_Python_3_Tutorial/Big_Data
# http://tinyurl.com/y9sybx3v
# https://janakiev.com/blog/wikidata-mayors/
import pywikibot
from pywikibot import pagegenerators as pg

# wdt:P31 : instance of
# wd:Q6256 : country
# wdt:P300 : ISO 3166-2 code
# wdt:P297 : ISO 3166-1 alpha-2 code (https://www.wikidata.org/wiki/Property:P297)
# wdt:P1448 : official name
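# First attempt, kept for reference; it is overwritten by the second query below.
# Its GROUP BY is not valid SPARQL because the other selected variables are
# neither grouped nor aggregated.
query = """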
SELECT ?item ?countryLabel ?official_name ?iso_code Where {
  ?item wdt:P31 wd:Q6256.
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
  OPTIONAL { ?item wdt:P1448 ?official_name }
  OPTIONAL { ?item wdt:P297 ?iso_code. }
}
GROUP BY ?iso_code

"""

query = """
SELECT ?item ?official_name ?ISO_3166_1_alpha_2_code WHERE {
  SERVICE wikibase:label { 
    bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
  ?item wdt:P31 wd:Q6256.
  OPTIONAL { ?item wdt:P1448 ?official_name. }
  OPTIONAL { ?item wdt:P297 ?ISO_3166_1_alpha_2_code. }
}
LIMIT 1000"""

site = pywikibot.Site("wikidata", "wikidata")
generator = pg.WikidataSPARQLPageGenerator(query, site=site)

for n, i in enumerate(generator):
    # print(i.title())
    ii = i.get()
    print(ii.keys())
    # ['aliases', 'labels', 'descriptions', 'claims', 'sitelinks']
    #print("{} {}".format(i, ii['labels']['en']))
    # print(ii['claims'].keys())
    p = ii['claims'].get("P1448")
    claims = ii['claims'].get("P297")
    if claims is not None:
        assert len(claims) == 1
        # claims[0] is a pywikibot.Claim; getTarget() returns its value
        iso_code = claims[0].getTarget()
        # print(ii['descriptions'])
        label_en = ii['labels']['en']
        print("{} {}".format(iso_code, label_en))
        #help(iso_code)
        # help(p)
        break

The next step for that script would be to also print the claims (as explained in the Data Harvest tutorial). I got stuck when I realized that the ISO code fields seem to be empty when I run that query on https://query.wikidata.org.
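One thing worth keeping in mind when inspecting such results: in the JSON that a SPARQL endpoint returns, a variable from an OPTIONAL clause that did not match is simply absent from that row. Here is a minimal sketch, assuming the standard SPARQL 1.1 JSON results layout. The sample data is hand-written (the second item is hypothetical) and the helper names binding_value and iso_codes are made up for illustration. It does not explain why the fields looked empty in the query service, it only shows how to read such rows safely.

```python
# Hand-written sample in the W3C "SPARQL 1.1 Query Results JSON Format".
# The second row stands for a hypothetical item without an ISO code.
sample = {
    "head": {"vars": ["item", "official_name", "ISO_3166_1_alpha_2_code"]},
    "results": {
        "bindings": [
            {
                "item": {"type": "uri",
                         "value": "http://www.wikidata.org/entity/Q191"},
                "ISO_3166_1_alpha_2_code": {"type": "literal", "value": "EE"},
            },
            {
                "item": {"type": "uri",
                         "value": "http://www.wikidata.org/entity/Q_EXAMPLE"},
            },
        ]
    },
}


def binding_value(row, name, default=None):
    """Return the value bound to `name` in one result row, or `default`."""
    cell = row.get(name)
    return default if cell is None else cell["value"]


def iso_codes(result):
    """Map each item URI to its ISO code (None when the OPTIONAL did not match)."""
    return {
        binding_value(row, "item"): binding_value(row, "ISO_3166_1_alpha_2_code")
        for row in result["results"]["bindings"]
    }


codes = iso_codes(sample)
for uri, code in codes.items():
    print(uri, code)
```

Using row.get() instead of row[name] is the point here: a missing OPTIONAL value then shows up as None rather than as a KeyError.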

EDIT: I later also forked the wikidata project on GitHub and got the following script to run:

from wikidatafun import getAllCountries
for i, c in enumerate(getAllCountries()):
    print(i, c)

TODO: