A simple Python package to represent events from Wikipedia and Wikidata resources.

wikivents

This Python package represents events based on both ontologies and semi-structured databases. So far, only Wikidata and Wikipedia are implemented as data sources; the package name, wikivents, reflects this origin.

Events

An event is an ontology entity. For Wikidata, an event is an instance of an occurrence (Q1190554) or of an event (Q1656682).

An event is defined by a few characteristics:

  • A type, for instance rebellion (Q124734) on Wikidata. This information is provided by wd:P31 or rdf:type, depending on the source.
  • Date of occurrence, converted if necessary to the Gregorian calendar.
  • Location where the event happened.
  • The entities related to the event. The extraction process is based on semi-structured databases, which are searched for entities (people involved, places, etc.). Each entity is counted, and this count indicates how relevant the entity is to the event. We assume that entities found in multiple lead sections are important to the event description.
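
The counting idea in the last item can be sketched in a few lines. This is only an illustration under stated assumptions: the input dictionary, the function name count_entities and the sample data are hypothetical, and the sketch counts every mention, which may differ from how wikivents itself computes its counts.

```python
from collections import Counter

# Hypothetical input: entity identifiers found in the lead section of each
# language edition of an article (illustrative data, not real extraction output).
lead_section_entities = {
    "en": ["Q1761", "Q27", "Q1761", "Q145"],
    "fr": ["Q1761", "Q27"],
    "de": ["Q1761", "Q145"],
}


def count_entities(entities_by_language):
    """Count how many times each entity identifier is mentioned overall."""
    counts = Counter()
    for entity_ids in entities_by_language.values():
        # Assumption: every mention counts, including repeats in one section.
        counts.update(entity_ids)
    return counts


entity_counts = count_entities(lead_section_entities)
# Entities sorted from most to least mentioned.
ranking = entity_counts.most_common()
```

An entity mentioned in several lead sections ends up at the top of ranking, which matches the assumption that such entities matter most to the event.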

Usage

Gathering data from multiple APIs (ontologies and semi-structured databases) can take a long time. We implemented a cache which speeds up queries and avoids querying the same data twice.
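
The effect of such a cache can be illustrated with the standard library. This is a minimal sketch, not the cache used by wikivents: fetch_remote and cached_fetch are hypothetical names, and the remote request is simulated so the example runs without network access.

```python
from functools import lru_cache

calls = []  # records every simulated remote request


def fetch_remote(entity_id):
    """Stand-in for a slow API request (hypothetical; no network access)."""
    calls.append(entity_id)
    return {"id": entity_id}


@lru_cache(maxsize=None)
def cached_fetch(entity_id):
    """Return the data for entity_id, querying the remote source at most once."""
    return fetch_remote(entity_id)


first = cached_fetch("Q193689")
second = cached_fetch("Q193689")  # served from the cache, no second request
```

After both calls, calls contains a single entry: the second lookup never reaches the simulated API.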

So far, the only implemented toolchain is based on Wikimedia Foundation projects: Wikidata for the ontology and Wikipedia articles for the semi-structured databases. The representation of an event can be obtained using the WikimediaFactory object, as in the example below:

from wikivents.factories.wikimedia import WikimediaFactory
from wikivents.models import EntityId

easter_rising_entity_id = EntityId("Q193689")
easter_rising_event = WikimediaFactory.create_event(easter_rising_entity_id)

The previous code takes the Easter Rising entity id (Q193689) as input. With another factory that processes other ontologies, the entry point will not be the same.

From the easter_rising_event object, it is possible to access multiple attributes:

  • The event identifier, which comes from the ontology itself:
easter_rising_event.id
Out[5]: 'Q193689'
  • The event label, in any available language. If the label is unavailable, an empty string is returned.
easter_rising_event.label("fr")
Out[7]: 'Insurrection de Pâques 1916'
  • The event's alternative names, that is, all the names used in a specific language to refer to the event.
easter_rising_event.names("de")
Out[8]: {'Easter Rising', 'Irischer Osteraufstand 1916', 'Osteraufstand'}
  • The event description, in plain text.
easter_rising_event.description("en")
Out[9]: 'an armed insurrection in Ireland during Easter Week, 1916'
  • The event boundary dates, beginning and end.
easter_rising_event.beginning, easter_rising_event.end
Out[12]: (datetime.datetime(1916, 4, 24, 0, 0, tzinfo=datetime.timezone.utc),  datetime.datetime(1916, 4, 30, 0, 0, tzinfo=datetime.timezone.utc))
  • The entities involved, accessible using the gpe property for GPE, org property for ORG and per property for PER entities.
easter_rising_event.gpe
Out[13]: [ParticipatingEntity(entity=<Entity(Q1761, Dublin)>, count=8),ParticipatingEntity(entity=<Entity(Q27, Ireland)>, count=5)]
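
The boundary dates shown above are timezone-aware datetime objects, so they can be used directly for date arithmetic. The sketch below rebuilds the two values from the output above instead of calling the library, so it runs on its own.

```python
from datetime import datetime, timezone

# Values copied from the beginning/end output shown above.
beginning = datetime(1916, 4, 24, tzinfo=timezone.utc)
end = datetime(1916, 4, 30, tzinfo=timezone.utc)

# Subtracting two aware datetimes yields a timedelta.
duration = end - beginning
duration_in_days = duration.days
```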

Encode events in reusable formats

The wikivents library also provides encoders, which transform the event object into other formats that are easier to manipulate. For now, we provide a dictionary encoder (EventToDictEncoder), which makes it easy to create a JSON file.

from wikivents.model_encoders import EventToDictEncoder
from wikivents.models import ISO6391LanguageCode

encoder = EventToDictEncoder(easter_rising_event, ISO6391LanguageCode("en"))
encoder.encode()

The purpose of encoders is to expose all the knowledge acquired about an event and to show the entities involved. The encode() method also accepts a parameter, participating_entities_ratio_to_keep, which indicates which entities are kept. It is a value between 0 and 1: 1 means keeping all the participating entities, and 0.5 means keeping them only if they were found in at least 50% of the processed semi-structured databases.

Below is an example, and the associated JSON output:

import json

with open("easter_rising_event.json", mode="w") as easter_rising_json_file:
    json.dump(encoder.encode(), easter_rising_json_file)
[
  {
    "Q193689": {
      "iso_639_1_language_code": "en",
      "id": "Q193689",
      "type": "EVENT",
      "label": "Easter Rising",
      "description": "an armed insurrection in Ireland during Easter Week, 1916",
      "names": [
        "1916 Rising",
        "Easter Rebellion",
        "Easter Rising"
      ],
      "processed_languages": [
        "de",
        "it",
        "es",
        "fr",
        "en"
      ],
      "entities_kept_if_mentioned_in_more_at_least_X_languages": 2.5,
      "start": "1916-04-24T00:00:00+00:00",
      "end": "1916-04-30T00:00:00+00:00",
      "entities": {
        "per": [
          {
            "id": "Q213374",
            "type": [
              "PERSON"
            ],
            "label": "James Connolly",
            "description": "James Connolly",
            "names": [
              "James Connolly"
            ],
            "count": 3
          },
          {
            "id": "Q274143",
            "type": [
              "PERSON"
            ],
            "label": "Patrick Pearse",
            "description": "Patrick Pearse",
            "names": [
              "Patrick Henry Pearse",
              "Patrick Pearse",
              "Padraig Pearse"
            ],
            "count": 3
          }
        ],
        "gpe": [
          {
            "id": "Q1761",
            "type": [
              "GPE"
            ],
            "label": "Dublin",
            "description": "Dublin",
            "names": [
              "City of Dublin",
              "Baile Átha Cliath",
              "Dublin",
              "Dublin city"
            ],
            "count": 8
          },
          {
            "id": "Q27",
            "type": [
              "GPE",
              "ORG"
            ],
            "label": "Ireland",
            "description": "Ireland",
            "names": [
              "🇮🇪",
              "Eire",
              "Éire",
              "Ireland",
              "ie",
              "ireland",
              "IRL",
              "Ireland, Republic of",
              "Republic of Ireland",
              "Hibernia",
              "IE",
              "Southern Ireland"
            ],
            "count": 5
          },
          {
            "id": "Q145",
            "type": [
              "GPE",
              "ORG"
            ],
            "label": "United Kingdom",
            "description": "United Kingdom",
            "names": [
              "G.B.",
              "GBR",
              "United Kingdom",
              "U. K.",
              "U K",
              "GB",
              "UK",
              "United Kingdom of Great Britain and Northern Ireland",
              "G B R",
              "Marea Britanie",
              "G. B. R.",
              "G B",
              "G. B.",
              "G.B.R.",
              "🇬🇧",
              "U.K.",
              "Great Britain"
            ],
            "count": 3
          }
        ],
        "org": [
          {
            "id": "Q27",
            "type": [
              "GPE",
              "ORG"
            ],
            "label": "Ireland",
            "description": "Ireland",
            "names": [
              "🇮🇪",
              "Eire",
              "Éire",
              "Ireland",
              "ie",
              "ireland",
              "IRL",
              "Ireland, Republic of",
              "Republic of Ireland",
              "Hibernia",
              "IE",
              "Southern Ireland"
            ],
            "count": 5
          },
          {
            "id": "Q1074958",
            "type": [
              "ORG"
            ],
            "label": "Irish Volunteers",
            "description": "Irish Volunteers",
            "names": [
              "Irish Volunteers",
              "Irish Volunteer Army",
              "Irish Volunteer Force"
            ],
            "count": 3
          },
          {
            "id": "Q145",
            "type": [
              "GPE",
              "ORG"
            ],
            "label": "United Kingdom",
            "description": "United Kingdom",
            "names": [
              "G.B.",
              "GBR",
              "United Kingdom",
              "U. K.",
              "U K",
              "GB",
              "UK",
              "United Kingdom of Great Britain and Northern Ireland",
              "G B R",
              "Marea Britanie",
              "G. B. R.",
              "G B",
              "G. B.",
              "G.B.R.",
              "🇬🇧",
              "U.K.",
              "Great Britain"
            ],
            "count": 3
          },
          {
            "id": "Q222595",
            "type": [
              "ORG"
            ],
            "label": "British Army",
            "description": "British Army",
            "names": [
              "British Army",
              "army of the United Kingdom"
            ],
            "count": 3
          },
          {
            "id": "Q1190570",
            "type": [
              "ORG"
            ],
            "label": "Irish Citizen Army",
            "description": "Irish Citizen Army",
            "names": [
              "Irish Citizen Army"
            ],
            "count": 3
          },
          {
            "id": "Q427496",
            "type": [
              "ORG"
            ],
            "label": "Cumann na mBan",
            "description": "Cumann na mBan",
            "names": [
              "Cumann na mBan",
              "CnamB",
              "The Irishwomen's Council"
            ],
            "count": 3
          }
        ]
      }
    }
  }
]
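
The threshold of 2.5 reported in the output above, for five processed languages, is consistent with a ratio of 0.5. The sketch below shows one way such a filter could work. It is an assumption read off that output, not the library's implementation: the function keep_frequent_entities and the low-count entry are hypothetical, and boundary values such as 1 may be handled differently by wikivents.

```python
def keep_frequent_entities(entities, processed_languages, ratio):
    """Keep entities whose count reaches ratio * number of processed languages."""
    threshold = ratio * len(processed_languages)
    return [entity for entity in entities if entity["count"] >= threshold]


processed_languages = ["de", "it", "es", "fr", "en"]

# Counts for the first three entries come from the output above;
# the last entry is hypothetical and only shows an entity being dropped.
gpe_entities = [
    {"id": "Q1761", "label": "Dublin", "count": 8},
    {"id": "Q27", "label": "Ireland", "count": 5},
    {"id": "Q145", "label": "United Kingdom", "count": 3},
    {"id": "Q_EXAMPLE", "label": "hypothetical low-count entity", "count": 1},
]

kept = keep_frequent_entities(gpe_entities, processed_languages, 0.5)
kept_labels = [entity["label"] for entity in kept]
```

With a ratio of 0.5 and five languages the threshold is 2.5, so the three entities from the output are kept and the hypothetical one is dropped.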
