Skip to main content

First aid utilies for knowledge graph exploration with an entity centric approach

Project description

forayer logo

forayer

Tests Linting Test coverage Stable python versions MIT License Code style: black

About

Forayer is a library of first aid utilities for knowledge graph exploration with an entity centric approach. It is intended to make data integration of knowledge graphs easier. With entities as first class citizens forayer is a toolset to aid in knowledge graph exploration for data integration and specifically entity resolution.

You can easily load pre-existing entity resolution tasks:

  >>> from forayer.datasets import OpenEADataset
  >>> ds = OpenEADataset(ds_pair="D_W",size="15K",version=1)
  >>> ds.er_task
  ERTask({DBpedia: (# entities: 15000, # entities_with_rel: 15000, # rel: 13359,
  # entities_with_attributes: 13782, # attributes: 13782, # attr_values: 24995),
  Wikidata: (# entities: 15000, # entities_with_rel: 15000, # rel: 13554,
  # entities_with_attributes: 14376, # attributes: 14376, # attr_values: 114107)},
  ClusterHelper(# elements:30000, # clusters:15000))

This entity resolution task holds 2 knowledge graphs and a cluster of known matches. You can search in knowledge graphs:

  >>> ds.er_task["DBpedia"].search("Dorothea")
  KG(entities={'http://dbpedia.org/resource/E801200': 
  {'http://dbpedia.org/ontology/activeYearsStartYear': '"1948"^^<http://www.w3.org/2001/XMLSchema#gYear>',
  'http://dbpedia.org/ontology/activeYearsEndYear': '"2008"^^<http://www.w3.org/2001/XMLSchema#gYear>',
  'http://dbpedia.org/ontology/birthName': 'Dorothea Carothers Allen',
  'http://dbpedia.org/ontology/alias': 'Allen, Dorothea Carothers',
  'http://dbpedia.org/ontology/birthYear': '"1923"^^<http://www.w3.org/2001/XMLSchema#gYear>',
  'http://purl.org/dc/elements/1.1/description': 'Film editor',
  'http://dbpedia.org/ontology/birthDate': '"1923-12-03"^^<http://www.w3.org/2001/XMLSchema#date>',
  'http://dbpedia.org/ontology/deathDate': '"2010-04-17"^^<http://www.w3.org/2001/XMLSchema#date>', 
  'http://dbpedia.org/ontology/deathYear': '"2010"^^<http://www.w3.org/2001/XMLSchema#gYear>'}}, rel={}, name=DBpedia)

Decide to work with a smaller snippet of the resolution task:

  >>> ert_sample = ds.er_task.sample(100)
  >>> ert_sample
  ERTask({DBpedia: (# entities: 100, # entities_with_rel: 6, # rel: 4,
  # entities_with_attributes: 99, # attributes: 99, # attr_values: 274),
  Wikidata: (# entities: 100, # entities_with_rel: 4, # rel: 4,
  # entities_with_attributes: 100, # attributes: 100, # attr_values: 797)},
  ClusterHelper(# elements:200, # clusters:100))

And much more can be found in the user guide.

Installation

You can install forayer via pip:

  pip install forayer

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

forayer-0.4.2.tar.gz (31.8 kB view details)

Uploaded Source

Built Distribution

forayer-0.4.2-py3-none-any.whl (37.0 kB view details)

Uploaded Python 3

File details

Details for the file forayer-0.4.2.tar.gz.

File metadata

  • Download URL: forayer-0.4.2.tar.gz
  • Upload date:
  • Size: 31.8 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/1.1.13 CPython/3.9.13 Linux/5.13.0-1025-azure

File hashes

Hashes for forayer-0.4.2.tar.gz
Algorithm Hash digest
SHA256 f4c3da61fdf85f2428dbd32eb2eded504d4bce7cfefb96493e0ad313ff0adc83
MD5 5df6268a96d76a5dc45ba25cfc2235b7
BLAKE2b-256 b4a75addb9dcf3a039a48bb8e953c142fb26eb24d27ce68aa726e5ef3b7a30f7

See more details on using hashes here.

File details

Details for the file forayer-0.4.2-py3-none-any.whl.

File metadata

  • Download URL: forayer-0.4.2-py3-none-any.whl
  • Upload date:
  • Size: 37.0 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/1.1.13 CPython/3.9.13 Linux/5.13.0-1025-azure

File hashes

Hashes for forayer-0.4.2-py3-none-any.whl
Algorithm Hash digest
SHA256 a8c30d5c37f82096670bca7c24940b6e98296986e9f7fd18f528149bfff21c37
MD5 62ac2824d4464368d18d3f3493bd0189
BLAKE2b-256 7d5d5e2d0d11e281db315121303eb0f860a755ee71a2aa703e13b963a200bc22

See more details on using hashes here.

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page