First aid utilities for knowledge graph exploration with an entity-centric approach
forayer
About
Forayer is a library of first aid utilities for knowledge graph exploration with an entity-centric approach. It is intended to make data integration of knowledge graphs easier. With entities as first-class citizens, forayer is a toolset that aids knowledge graph exploration for data integration, and specifically for entity resolution.
You can easily load pre-existing entity resolution tasks:
>>> from forayer.datasets import OpenEADataset
>>> ds = OpenEADataset(ds_pair="D_W", size="15K", version=1)
>>> ds.er_task
ERTask({DBpedia: (# entities: 15000, # entities_with_rel: 15000, # rel: 13359,
# entities_with_attributes: 13782, # attributes: 13782, # attr_values: 24995),
Wikidata: (# entities: 15000, # entities_with_rel: 15000, # rel: 13554,
# entities_with_attributes: 14376, # attributes: 14376, # attr_values: 114107)},
ClusterHelper(# elements:30000, # clusters:15000))
This entity resolution task holds two knowledge graphs and the clusters of known matches between them. You can search in knowledge graphs:
>>> ds.er_task["DBpedia"].search("Dorothea")
KG(entities={'http://dbpedia.org/resource/E801200':
{'http://dbpedia.org/ontology/activeYearsStartYear': '"1948"^^<http://www.w3.org/2001/XMLSchema#gYear>',
'http://dbpedia.org/ontology/activeYearsEndYear': '"2008"^^<http://www.w3.org/2001/XMLSchema#gYear>',
'http://dbpedia.org/ontology/birthName': 'Dorothea Carothers Allen',
'http://dbpedia.org/ontology/alias': 'Allen, Dorothea Carothers',
'http://dbpedia.org/ontology/birthYear': '"1923"^^<http://www.w3.org/2001/XMLSchema#gYear>',
'http://purl.org/dc/elements/1.1/description': 'Film editor',
'http://dbpedia.org/ontology/birthDate': '"1923-12-03"^^<http://www.w3.org/2001/XMLSchema#date>',
'http://dbpedia.org/ontology/deathDate': '"2010-04-17"^^<http://www.w3.org/2001/XMLSchema#date>',
'http://dbpedia.org/ontology/deathYear': '"2010"^^<http://www.w3.org/2001/XMLSchema#gYear>'}}, rel={}, name=DBpedia)
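The search result above maps each entity ID to a dictionary of attribute names and values. As a rough illustration of that shape, here is a minimal plain-Python sketch of a case-insensitive substring search over such a mapping. It is not forayer's implementation: `search_entities` and the toy data are hypothetical and exist only for this example.

```python
from typing import Dict

# Entity ID -> {attribute name -> attribute value}, mirroring the output shape above.
Entities = Dict[str, Dict[str, str]]


def search_entities(entities: Entities, query: str) -> Entities:
    """Return the entities with at least one attribute value containing `query`.

    Hypothetical helper for illustration; matching is case-insensitive.
    """
    needle = query.lower()
    return {
        entity_id: attributes
        for entity_id, attributes in entities.items()
        if any(needle in str(value).lower() for value in attributes.values())
    }


# Toy data (made up for this sketch, not taken from the dataset).
toy_entities = {
    "e1": {"birthName": "Dorothea Carothers Allen", "description": "Film editor"},
    "e2": {"description": "Composer"},
}

hits = search_entities(toy_entities, "Dorothea")
print(hits)
```

Returning the matching entities in the same ID-to-attributes shape keeps the result usable wherever the full mapping was.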
You can also decide to work with a smaller sample of the entity resolution task:
>>> ert_sample = ds.er_task.sample(100)
>>> ert_sample
ERTask({DBpedia: (# entities: 100, # entities_with_rel: 6, # rel: 4,
# entities_with_attributes: 99, # attributes: 99, # attr_values: 274),
Wikidata: (# entities: 100, # entities_with_rel: 4, # rel: 4,
# entities_with_attributes: 100, # attributes: 100, # attr_values: 797)},
ClusterHelper(# elements:200, # clusters:100))
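The sample above contains 100 clusters, 200 cluster elements, and 100 entities per knowledge graph, which suggests that matched entities are kept together when sampling. The following plain-Python sketch shows one way to get that behaviour, assuming that clusters are sampled first and each graph is then restricted to the sampled members. It is not forayer's implementation; `sample_task` and the toy data are hypothetical.

```python
import random
from typing import Dict, List, Set, Tuple

Entities = Dict[str, Dict[str, str]]


def sample_task(
    kgs: Dict[str, Entities],
    clusters: List[Set[str]],
    n: int,
    seed: int = 0,
) -> Tuple[Dict[str, Entities], List[Set[str]]]:
    """Sample `n` clusters and keep only the entities that belong to them.

    Hypothetical helper: assumes sampling is done per cluster so that
    every sampled entity keeps its known match in the other graph.
    """
    rng = random.Random(seed)
    chosen = rng.sample(clusters, n)
    keep = set().union(*chosen)  # all entity IDs in the sampled clusters
    sampled_kgs = {
        name: {eid: attrs for eid, attrs in entities.items() if eid in keep}
        for name, entities in kgs.items()
    }
    return sampled_kgs, chosen


# Toy task: two graphs with ten entities each and one match per entity.
toy_kgs = {
    "A": {f"a{i}": {"label": f"item {i}"} for i in range(10)},
    "B": {f"b{i}": {"label": f"item {i}"} for i in range(10)},
}
toy_clusters = [{f"a{i}", f"b{i}"} for i in range(10)]

sampled_kgs, sampled_clusters = sample_task(toy_kgs, toy_clusters, n=3)
print({name: len(entities) for name, entities in sampled_kgs.items()})
print(len(sampled_clusters))
```

Sampling clusters rather than individual entities means the known matches remain complete in the smaller task.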
Much more can be found in the user guide.
Installation
You can install forayer via pip:
pip install forayer
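To confirm that the installation succeeded, you can ask the standard library which version is installed. This is a small sketch using `importlib.metadata` (Python 3.8+); the helper name `installed_version` is made up for this example.

```python
from importlib import metadata
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Prints a version string if forayer is installed, otherwise None.
print(installed_version("forayer"))
```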
Project details
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
File details
Details for the file forayer-0.4.4.tar.gz.
File metadata
- Download URL: forayer-0.4.4.tar.gz
- Upload date:
- Size: 32.2 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: poetry/1.3.1 CPython/3.10.6 Linux/5.19.0-35-generic
File hashes
Algorithm | Hash digest
---|---
SHA256 | b4ae8202d10fbf8b55a7ed0cedcf10984f6cd8ac3dbe34c4eb071ef363f0fd0f
MD5 | 0edbfa810616a6fc48406b8c4d57a1a2
BLAKE2b-256 | dbae196afe615e2a98f28a20b424951c2b6b866751016e1087a1d703613c5efb
File details
Details for the file forayer-0.4.4-py3-none-any.whl.
File metadata
- Download URL: forayer-0.4.4-py3-none-any.whl
- Upload date:
- Size: 37.3 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: poetry/1.3.1 CPython/3.10.6 Linux/5.19.0-35-generic
File hashes
Algorithm | Hash digest
---|---
SHA256 | 947368aba600d3875747ac33611dd2b1445c71ed60bfc1af5681ad213b367ce3
MD5 | 3f159d3bea9b3521859a75fc859217e4
BLAKE2b-256 | 6757a4b802464a5863c925a028c1bfb42c3c961926ab8fd524182aab7310b1d2
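If you download one of the files manually, you can compare its digest with the values listed above. The sketch below uses `hashlib` from the standard library and checks the SHA256 digest of the source distribution. The local path `forayer-0.4.4.tar.gz` is an assumption about where the file was saved, and the helper names are made up for this example.

```python
import hashlib
from pathlib import Path

# SHA256 digest listed above for forayer-0.4.4.tar.gz.
EXPECTED_SHA256 = "b4ae8202d10fbf8b55a7ed0cedcf10984f6cd8ac3dbe34c4eb071ef363f0fd0f"


def sha256_of(path, chunk_size: int = 65536) -> str:
    """Compute the SHA256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected_hex: str) -> bool:
    """Return True if the file's SHA256 digest equals `expected_hex`."""
    return sha256_of(path) == expected_hex.lower()


# Assumed download location; adjust to wherever the archive was saved.
archive = Path("forayer-0.4.4.tar.gz")
if archive.exists():
    print(matches_expected(archive, EXPECTED_SHA256))
```

Reading in chunks keeps memory use constant regardless of file size.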