A simple Python package to represent events from Wikipedia and Wikidata resources.
Project description
wikivents
This Python package is used to represent events based on both ontologies and semi-structured databases. Up to now, only Wikidata and Wikipedia are implemented and provide these data. The name of this package, wikivents
show this legacy.
Events
An event is an ontology entity. For Wikidata, an event is an instance of an occurrence (Q1190554) or of an event (Q1656682).
A event is defined by few characteristics:
- A type: on Wikidata, this defines the type of the event. For instance, rebellion (Q124734) on Wikidata. This information is provided by the
wd:P31
orrdf:type
, depending on the source. - Date of occurrence, converted if necessary in the Gregorian calendar.
- Location where the event happened.
- The entities related to the event. The extraction process is based on semi-strucuted databases, which are processed in search of entities (people involved, places, etc). Entities are counted. This index shows the relevance of the entity in relation to the event. We assume the entities found in multiple lead sections are important entities in the event description.
Usage
The process of gathering data from multiple APIs (ontologies and semi-structured databases) can take a long time. We implemented a cache which speeds up queries and prevents from querying twice the same data.
Up to now, the only implemented toolchain is based on Wikimedia foundation projects: Wikidata for the ontology and Wikipedia articles for semi-structured databases. Then it is possible to get the representation of an event using the WikimediaFactory
object, as in the example below:
from wikivents.factories.wikimedia import WikimediaFactory
from wikivents.models import EntityId
easter_rising_entity_id = EntityId("Q193689")
easter_rising_event = WikimediaFactory.create_event(easter_rising_entity_id)
The previous code takes the Easter Rising event entity id (Q193689
)' as input. If using another Factory
which processes other ontologies, the entry point will obviously not be the same.
From the easter_rising_event
object, it is possible to access multiple attributes:
- The event identifier, which comes from the ontology itself:
easter_rising_event.id
Out[5]: 'Q193689'
- The event label, in any possible language. If unavailable, will return an empty string.
easter_rising_event.label("fr")
Out[7]: 'Insurrection de Pâques 1916'
- The event alternative names, which means all the names that are used in a specific language to speak about the event.
easter_rising_event.names("de")
Out[8]: {'Easter Rising', 'Irischer Osteraufstand 1916', 'Osteraufstand'}
- The event description, in plain text.
easter_rising_event.description("en")
Out[9]: 'an armed insurrection in Ireland during Easter Week, 1916'
- The event boundary dates, beginning and end.
easter_rising_event.beginning, easter_rising_event.end
Out[12]: (datetime.datetime(1916, 4, 24, 0, 0, tzinfo=datetime.timezone.utc), datetime.datetime(1916, 4, 30, 0, 0, tzinfo=datetime.timezone.utc))
- The entities involved, accessible using the
gpe
property for GPE,org
property for ORG andper
property for PER entities.
easter_rising_event.gpe
Out[13]: [ParticipatingEntity(entity=<Entity(Q1761, Dublin)>, count=8),ParticipatingEntity(entity=<Entity(Q27, Ireland)>, count=5)]
Encode events in reusable formats
The wikivents
library also provides encoders which are used to transform the event object into other formats, easier to manipulate. From now on, we provide a DictEncoder
which make it easy to create a JSON file from it.
from wikivents.model_encoders import EventToDictEncoder
from wikivents.models import ISO6391LanguageCode
encoder = EventToDictEncoder(easter_rising_event, ISO6391LanguageCode("en"))
encoder.encode()
The purpose of encoders is to provide all the possible knowledge acquired about events and to show involved entities. It is also possible to add a parameter to the encode()
method, participating_entities_ratio_to_keep
to indicate which entities will be kept. It is a value comprised between 0 and 1. 1
means to keep all the participating entities, 0.5
to keep them only if they were found in at least 50% of the total number of processed semi-structured databases.
Below is an example, and the associated JSON output:
import json
with open("easter_rising_event.json", mode="w") as easter_rising_json_file:
json.dump(encoder.encode(), easter_rising_json_file)
[
{
"Q193689": {
"iso_639_1_language_code": "en",
"id": "Q193689",
"type": "EVENT",
"label": "Easter Rising",
"description": "an armed insurrection in Ireland during Easter Week, 1916",
"names": [
"1916 Rising",
"Easter Rebellion",
"Easter Rising"
],
"processed_languages": [
"de",
"it",
"es",
"fr",
"en"
],
"entities_kept_if_mentioned_in_more_at_least_X_languages": 2.5,
"start": "1916-04-24T00:00:00+00:00",
"end": "1916-04-30T00:00:00+00:00",
"entities": {
"per": [
{
"id": "Q213374",
"type": [
"PERSON"
],
"label": "James Connolly",
"description": "James Connolly",
"names": [
"James Connolly"
],
"count": 3
},
{
"id": "Q274143",
"type": [
"PERSON"
],
"label": "Patrick Pearse",
"description": "Patrick Pearse",
"names": [
"Patrick Henry Pearse",
"Patrick Pearse",
"Padraig Pearse"
],
"count": 3
}
],
"gpe": [
{
"id": "Q1761",
"type": [
"GPE"
],
"label": "Dublin",
"description": "Dublin",
"names": [
"City of Dublin",
"Baile Átha Cliath",
"Dublin",
"Dublin city"
],
"count": 8
},
{
"id": "Q27",
"type": [
"GPE",
"ORG"
],
"label": "Ireland",
"description": "Ireland",
"names": [
"🇮🇪",
"Eire",
"Éire",
"Ireland",
"ie",
"ireland",
"IRL",
"Ireland, Republic of",
"Republic of Ireland",
"Hibernia",
"IE",
"Southern Ireland"
],
"count": 5
},
{
"id": "Q145",
"type": [
"GPE",
"ORG"
],
"label": "United Kingdom",
"description": "United Kingdom",
"names": [
"G.B.",
"GBR",
"United Kingdom",
"U. K.",
"U K",
"GB",
"UK",
"United Kingdom of Great Britain and Northern Ireland",
"G B R",
"Marea Britanie",
"G. B. R.",
"G B",
"G. B.",
"G.B.R.",
"🇬🇧",
"U.K.",
"Great Britain"
],
"count": 3
}
],
"org": [
{
"id": "Q27",
"type": [
"GPE",
"ORG"
],
"label": "Ireland",
"description": "Ireland",
"names": [
"🇮🇪",
"Eire",
"Éire",
"Ireland",
"ie",
"ireland",
"IRL",
"Ireland, Republic of",
"Republic of Ireland",
"Hibernia",
"IE",
"Southern Ireland"
],
"count": 5
},
{
"id": "Q1074958",
"type": [
"ORG"
],
"label": "Irish Volunteers",
"description": "Irish Volunteers",
"names": [
"Irish Volunteers",
"Irish Volunteer Army",
"Irish Volunteer Force"
],
"count": 3
},
{
"id": "Q145",
"type": [
"GPE",
"ORG"
],
"label": "United Kingdom",
"description": "United Kingdom",
"names": [
"G.B.",
"GBR",
"United Kingdom",
"U. K.",
"U K",
"GB",
"UK",
"United Kingdom of Great Britain and Northern Ireland",
"G B R",
"Marea Britanie",
"G. B. R.",
"G B",
"G. B.",
"G.B.R.",
"🇬🇧",
"U.K.",
"Great Britain"
],
"count": 3
},
{
"id": "Q222595",
"type": [
"ORG"
],
"label": "British Army",
"description": "British Army",
"names": [
"British Army",
"army of the United Kingdom"
],
"count": 3
},
{
"id": "Q1190570",
"type": [
"ORG"
],
"label": "Irish Citizen Army",
"description": "Irish Citizen Army",
"names": [
"Irish Citizen Army"
],
"count": 3
},
{
"id": "Q427496",
"type": [
"ORG"
],
"label": "Cumann na mBan",
"description": "Cumann na mBan",
"names": [
"Cumann na mBan",
"CnamB",
"The Irishwomen's Council"
],
"count": 3
}
]
}
}
}
]
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distributions
File details
Details for the file wikivents-1.0.4.tar.gz
.
File metadata
- Download URL: wikivents-1.0.4.tar.gz
- Upload date:
- Size: 34.8 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.1 CPython/3.11.3
File hashes
Algorithm | Hash digest | |
---|---|---|
SHA256 | c226a007fd3778fdd96bf32fa76956bf46d475001f031d674439144e64921294 |
|
MD5 | de105eb4d7b06c55fa8c9b211a64c0a0 |
|
BLAKE2b-256 | a7d0ac57383a42ecdf092a43acf1108110d4bccd8b4e757a2e6d6b19870666a0 |
File details
Details for the file wikivents-1.0.4-py3.11.egg
.
File metadata
- Download URL: wikivents-1.0.4-py3.11.egg
- Upload date:
- Size: 84.8 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.1 CPython/3.11.3
File hashes
Algorithm | Hash digest | |
---|---|---|
SHA256 | a5fb3eb1fd86e33eaa9b6d1f2d12ec4d8dd809454c609dd78aafa2099c587279 |
|
MD5 | 9909751b8dabe42095f8cbd613374595 |
|
BLAKE2b-256 | 6a63249a165f8ff1c06dfd5a19ac0ae23069babf85bfd783e2680e8ca81c2d78 |
File details
Details for the file wikivents-1.0.4-py3-none-any.whl
.
File metadata
- Download URL: wikivents-1.0.4-py3-none-any.whl
- Upload date:
- Size: 43.0 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.1 CPython/3.11.3
File hashes
Algorithm | Hash digest | |
---|---|---|
SHA256 | bd8a230f2a886214a4cd06bd51dd35de87ae4512eed43900d8dcb24efc599b8c |
|
MD5 | f2ce0f408770f72c4e234a13b6aeccfb |
|
BLAKE2b-256 | 019d77e7ef74c26da7a0ac84887e85c3903387c1ef8c9831ad2efd2575bd1e4a |