First aid utilities for knowledge graph exploration with an entity-centric approach

forayer

About

Forayer is a library of first aid utilities for knowledge graph exploration with an entity-centric approach. It treats entities as first-class citizens and is intended to make data integration of knowledge graphs easier, specifically entity resolution.

You can easily load pre-existing entity resolution tasks:

  >>> from forayer.datasets import OpenEADataset
  >>> ds = OpenEADataset(ds_pair="D_W", size="15K", version=1)
  >>> ds.er_task
  ERTask({DBpedia: (# entities: 15000, # entities_with_rel: 15000, # rel: 13359,
  # entities_with_attributes: 13782, # attributes: 13782, # attr_values: 24995),
  Wikidata: (# entities: 15000, # entities_with_rel: 15000, # rel: 13554,
  # entities_with_attributes: 14376, # attributes: 14376, # attr_values: 114107)},
  ClusterHelper(# elements:30000, # clusters:15000))

This entity resolution task holds two knowledge graphs and the clusters of known matches between them. You can search within a knowledge graph:

  >>> ds.er_task["DBpedia"].search("Dorothea")
  KG(entities={'http://dbpedia.org/resource/E801200': 
  {'http://dbpedia.org/ontology/activeYearsStartYear': '"1948"^^<http://www.w3.org/2001/XMLSchema#gYear>',
  'http://dbpedia.org/ontology/activeYearsEndYear': '"2008"^^<http://www.w3.org/2001/XMLSchema#gYear>',
  'http://dbpedia.org/ontology/birthName': 'Dorothea Carothers Allen',
  'http://dbpedia.org/ontology/alias': 'Allen, Dorothea Carothers',
  'http://dbpedia.org/ontology/birthYear': '"1923"^^<http://www.w3.org/2001/XMLSchema#gYear>',
  'http://purl.org/dc/elements/1.1/description': 'Film editor',
  'http://dbpedia.org/ontology/birthDate': '"1923-12-03"^^<http://www.w3.org/2001/XMLSchema#date>',
  'http://dbpedia.org/ontology/deathDate': '"2010-04-17"^^<http://www.w3.org/2001/XMLSchema#date>', 
  'http://dbpedia.org/ontology/deathYear': '"2010"^^<http://www.w3.org/2001/XMLSchema#gYear>'}}, rel={}, name=DBpedia)
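
The search result above prints as a mapping from entity id to a dictionary of attribute values. As a rough illustration of that idea, here is a minimal pure-Python sketch of a value search over such a mapping. It does not use forayer: the `Entities` alias, the `search_entities` helper, and the toy data are hypothetical, and the case-insensitive substring matching is an assumption, not a description of how `search` is implemented.

```python
from typing import Dict

# Hypothetical stand-in for a knowledge graph's entities:
# entity id -> {attribute name -> attribute value}.
Entities = Dict[str, Dict[str, str]]


def search_entities(entities: Entities, query: str) -> Entities:
    """Return the entities with at least one attribute value containing `query`.

    Matching is a case-insensitive substring test (an assumption made
    for this sketch only).
    """
    needle = query.lower()
    return {
        entity_id: attributes
        for entity_id, attributes in entities.items()
        if any(needle in str(value).lower() for value in attributes.values())
    }


# Toy data, loosely modelled on the attribute names shown above.
toy_entities = {
    "e1": {"birthName": "Dorothea Carothers Allen", "description": "Film editor"},
    "e2": {"birthName": "John Doe", "description": "Composer"},
}

hits = search_entities(toy_entities, "Dorothea")
print(sorted(hits))  # ['e1']
```

Returning the same nested shape as the input keeps the result usable wherever the full mapping is, which mirrors how the search above returns a smaller knowledge graph and not a bare list of ids.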

You can also work with a smaller sample of the resolution task:

  >>> ert_sample = ds.er_task.sample(100)
  >>> ert_sample
  ERTask({DBpedia: (# entities: 100, # entities_with_rel: 6, # rel: 4,
  # entities_with_attributes: 99, # attributes: 99, # attr_values: 274),
  Wikidata: (# entities: 100, # entities_with_rel: 4, # rel: 4,
  # entities_with_attributes: 100, # attributes: 100, # attr_values: 797)},
  ClusterHelper(# elements:200, # clusters:100))
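
The sampled task above keeps 100 entities per knowledge graph, 200 clustered elements, and far fewer relations. One plausible reading is that known matches are sampled and only relations between the kept entities survive. The sketch below illustrates that reading in plain Python. It is not forayer's implementation: `sample_matches`, `induced_edges`, and the toy data are hypothetical names introduced for illustration.

```python
import random
from typing import List, Set, Tuple

Pair = Tuple[str, str]  # (entity id in the left KG, entity id in the right KG)
Edge = Tuple[str, str]  # (head entity id, tail entity id)


def sample_matches(matches: List[Pair], n: int, seed: int = 0) -> List[Pair]:
    """Draw n known matches without replacement, reproducibly."""
    rng = random.Random(seed)
    return rng.sample(matches, n)


def induced_edges(edges: List[Edge], kept: Set[str]) -> List[Edge]:
    """Keep only relations whose head and tail are both in the sample."""
    return [(head, tail) for head, tail in edges if head in kept and tail in kept]


# Toy task: 1000 matched pairs and a chain of relations in the left KG.
matches = [(f"l{i}", f"r{i}") for i in range(1000)]
left_edges = [(f"l{i}", f"l{i + 1}") for i in range(999)]

sampled = sample_matches(matches, 100)
left_ids = {left for left, _ in sampled}
right_ids = {right for _, right in sampled}
kept_edges = induced_edges(left_edges, left_ids)

# 100 entities per side and 100 matches, i.e. 200 clustered elements.
print(len(left_ids), len(right_ids), len(sampled))
# Most relations point outside the sample and are dropped.
print(len(kept_edges) < len(left_edges))
```

Sampling matches instead of individual entities keeps every sampled entity paired with its counterpart, which is what makes the smaller task still usable for evaluating entity resolution.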

More features are described in the user guide.

Installation

You can install forayer via pip:

  pip install forayer
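
To confirm that the installation succeeded, you can ask Python's standard `importlib.metadata` module which version is installed. This is a small sketch using only the standard library (Python 3.8+); the `installed_version` helper is a hypothetical name introduced here.

```python
from importlib import metadata
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Prints a version string if forayer is installed, otherwise None.
print(installed_version("forayer"))
```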
