
First aid utilities for knowledge graph exploration with an entity-centric approach

forayer

MIT License. Code style: black.

About

Forayer is a library of first aid utilities for exploring knowledge graphs with an entity-centric approach. It treats entities as first-class citizens and is intended to make data integration of knowledge graphs easier, specifically entity resolution.

You can easily load pre-existing entity resolution tasks:

  >>> from forayer.datasets import OpenEADataset
  >>> ds = OpenEADataset(ds_pair="D_W", size="15K", version=1)
  >>> ds.er_task
  ERTask({DBpedia: (# entities: 15000, # entities_with_rel: 15000, # rel: 13359,
  # entities_with_attributes: 13782, # attributes: 13782, # attr_values: 24995),
  Wikidata: (# entities: 15000, # entities_with_rel: 15000, # rel: 13554,
  # entities_with_attributes: 14376, # attributes: 14376, # attr_values: 114107)},
  ClusterHelper(# elements:30000, # clusters:15000))

This entity resolution task holds two knowledge graphs and the clusters of known matches between them. You can search within a knowledge graph:

  >>> ds.er_task["DBpedia"].search("Dorothea")
  KG(entities={'http://dbpedia.org/resource/E801200': 
  {'http://dbpedia.org/ontology/activeYearsStartYear': '"1948"^^<http://www.w3.org/2001/XMLSchema#gYear>',
  'http://dbpedia.org/ontology/activeYearsEndYear': '"2008"^^<http://www.w3.org/2001/XMLSchema#gYear>',
  'http://dbpedia.org/ontology/birthName': 'Dorothea Carothers Allen',
  'http://dbpedia.org/ontology/alias': 'Allen, Dorothea Carothers',
  'http://dbpedia.org/ontology/birthYear': '"1923"^^<http://www.w3.org/2001/XMLSchema#gYear>',
  'http://purl.org/dc/elements/1.1/description': 'Film editor',
  'http://dbpedia.org/ontology/birthDate': '"1923-12-03"^^<http://www.w3.org/2001/XMLSchema#date>',
  'http://dbpedia.org/ontology/deathDate': '"2010-04-17"^^<http://www.w3.org/2001/XMLSchema#date>', 
  'http://dbpedia.org/ontology/deathYear': '"2010"^^<http://www.w3.org/2001/XMLSchema#gYear>'}}, rel={}, name=DBpedia)
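
To make the shape of the search result concrete, here is a minimal plain-Python sketch of a substring search over entity attribute values. It is not forayer's implementation: the function name `search_entities`, the case-insensitive matching, and the toy data (including the `example.org` entity) are assumptions made for illustration. It only mirrors the structure shown above, where entities map an identifier to a dictionary of attribute names and values.

```python
from typing import Dict

# Entity id -> {attribute name -> attribute value}, as in the output above.
Entities = Dict[str, Dict[str, str]]


def search_entities(entities: Entities, query: str) -> Entities:
    """Keep entities with at least one attribute value containing `query`.

    Matching is case-insensitive here; that is an assumption of this
    sketch, not a documented property of forayer's search.
    """
    needle = query.lower()
    return {
        entity_id: attrs
        for entity_id, attrs in entities.items()
        if any(needle in str(value).lower() for value in attrs.values())
    }


# Hypothetical toy data, loosely modelled on the DBpedia entity above.
toy_entities = {
    "http://dbpedia.org/resource/E801200": {
        "http://dbpedia.org/ontology/birthName": "Dorothea Carothers Allen",
        "http://purl.org/dc/elements/1.1/description": "Film editor",
    },
    "http://example.org/resource/other": {
        "http://purl.org/dc/elements/1.1/description": "Composer",
    },
}

hits = search_entities(toy_entities, "Dorothea")
print(list(hits))
```

The result keeps the same dictionary layout as the input, so it can be searched or filtered again.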

You can also work with a smaller sample of the entity resolution task:

  >>> ert_sample = ds.er_task.sample(100)
  >>> ert_sample
  ERTask({DBpedia: (# entities: 100, # entities_with_rel: 6, # rel: 4,
  # entities_with_attributes: 99, # attributes: 99, # attr_values: 274),
  Wikidata: (# entities: 100, # entities_with_rel: 4, # rel: 4,
  # entities_with_attributes: 100, # attributes: 100, # attr_values: 797)},
  ClusterHelper(# elements:200, # clusters:100))
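
The counts above (100 entities per graph, 200 elements in 100 clusters) suggest that sampling keeps matched entities together. The following is a rough plain-Python sketch of that idea, sampling known match pairs and keeping both sides of each pair. It is an illustration under that assumption, not forayer's code: `sample_matched_pairs`, the `seed` parameter, and the toy data are hypothetical.

```python
import random
from typing import Dict, List, Tuple

Entities = Dict[str, Dict[str, str]]
Match = Tuple[str, str]  # (left entity id, right entity id)


def sample_matched_pairs(
    left: Entities,
    right: Entities,
    matches: List[Match],
    n: int,
    seed: int = 0,
) -> Tuple[Entities, Entities, List[Match]]:
    """Sample `n` known matches and keep the entities on both sides."""
    rng = random.Random(seed)  # fixed seed for a reproducible sample
    chosen = rng.sample(matches, n)
    left_sample = {l_id: left[l_id] for l_id, _ in chosen}
    right_sample = {r_id: right[r_id] for _, r_id in chosen}
    return left_sample, right_sample, chosen


# Hypothetical toy data: ten entities per side, matched one-to-one.
left_kg = {f"l{i}": {"name": f"item {i}"} for i in range(10)}
right_kg = {f"r{i}": {"label": f"Item {i}"} for i in range(10)}
known_matches = [(f"l{i}", f"r{i}") for i in range(10)]

left_s, right_s, pairs = sample_matched_pairs(left_kg, right_kg, known_matches, 3)
print(len(left_s), len(right_s), len(pairs))
```

Sampling pairs rather than individual entities means every sampled entity still has its known match in the other graph.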

Much more can be found in the user guide.

Installation

You can install forayer via pip:

  pip install forayer
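
To check that the installation succeeded, one option is to query the installed distribution's version with the standard library (Python 3.8 or later). The helper name `installed_version` is a hypothetical one chosen for this sketch; `importlib.metadata` itself is part of the standard library.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# Prints a version string if forayer is installed, otherwise None.
print(installed_version("forayer"))
```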
