
First aid utilities for knowledge graph exploration with an entity-centric approach

forayer

License: MIT. Code style: black.

About

Forayer is a library of first aid utilities for knowledge graph exploration with an entity-centric approach. It is intended to make data integration of knowledge graphs easier. Treating entities as first-class citizens, forayer provides a toolset for knowledge graph exploration in support of data integration, specifically entity resolution.

You can easily load pre-existing entity resolution tasks:

  >>> from forayer.datasets import OpenEADataset
  >>> ds = OpenEADataset(ds_pair="D_W", size="15K", version=1)
  >>> ds.er_task
  ERTask({DBpedia: (# entities: 15000, # entities_with_rel: 15000, # rel: 13359,
  # entities_with_attributes: 13782, # attributes: 13782, # attr_values: 24995),
  Wikidata: (# entities: 15000, # entities_with_rel: 15000, # rel: 13554,
  # entities_with_attributes: 14376, # attributes: 14376, # attr_values: 114107)},
  ClusterHelper(# elements:30000, # clusters:15000))

This entity resolution task holds two knowledge graphs and the clusters of known matches between them. You can search within a knowledge graph:

  >>> ds.er_task["DBpedia"].search("Dorothea")
  KG(entities={'http://dbpedia.org/resource/E801200': 
  {'http://dbpedia.org/ontology/activeYearsStartYear': '"1948"^^<http://www.w3.org/2001/XMLSchema#gYear>',
  'http://dbpedia.org/ontology/activeYearsEndYear': '"2008"^^<http://www.w3.org/2001/XMLSchema#gYear>',
  'http://dbpedia.org/ontology/birthName': 'Dorothea Carothers Allen',
  'http://dbpedia.org/ontology/alias': 'Allen, Dorothea Carothers',
  'http://dbpedia.org/ontology/birthYear': '"1923"^^<http://www.w3.org/2001/XMLSchema#gYear>',
  'http://purl.org/dc/elements/1.1/description': 'Film editor',
  'http://dbpedia.org/ontology/birthDate': '"1923-12-03"^^<http://www.w3.org/2001/XMLSchema#date>',
  'http://dbpedia.org/ontology/deathDate': '"2010-04-17"^^<http://www.w3.org/2001/XMLSchema#date>', 
  'http://dbpedia.org/ontology/deathYear': '"2010"^^<http://www.w3.org/2001/XMLSchema#gYear>'}}, rel={}, name=DBpedia)
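
To make the shape of that result easier to read, here is a minimal plain-Python sketch of a value search over an entity dictionary. It is not forayer's implementation: the `search_entities` helper, the `Entities` alias, the toy data, and the choice of a case-insensitive substring match are all assumptions made for illustration.

```python
from typing import Dict

# Hypothetical stand-in for a knowledge graph's entities:
# entity id -> {attribute name: attribute value}.
Entities = Dict[str, Dict[str, str]]


def search_entities(entities: Entities, query: str) -> Entities:
    """Return entities with at least one attribute value containing `query`.

    Matching is a case-insensitive substring test (an assumption of this sketch).
    """
    needle = query.lower()
    return {
        entity_id: attributes
        for entity_id, attributes in entities.items()
        if any(needle in str(value).lower() for value in attributes.values())
    }


toy_entities: Entities = {
    "http://dbpedia.org/resource/E801200": {
        "http://dbpedia.org/ontology/birthName": "Dorothea Carothers Allen",
        "http://purl.org/dc/elements/1.1/description": "Film editor",
    },
    # Made-up entity, present only so that the search has something to exclude.
    "http://example.org/resource/other": {
        "http://purl.org/dc/elements/1.1/description": "Composer",
    },
}

hits = search_entities(toy_entities, "Dorothea")
print(list(hits))
```

As in the forayer output above, the result keeps the same id-to-attributes layout, restricted to the matching entities.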

You can also work with a smaller sample of the resolution task:

  >>> ert_sample = ds.er_task.sample(100)
  >>> ert_sample
  ERTask({DBpedia: (# entities: 100, # entities_with_rel: 6, # rel: 4,
  # entities_with_attributes: 99, # attributes: 99, # attr_values: 274),
  Wikidata: (# entities: 100, # entities_with_rel: 4, # rel: 4,
  # entities_with_attributes: 100, # attributes: 100, # attr_values: 797)},
  ClusterHelper(# elements:200, # clusters:100))
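
The counts in this output (100 entities per graph, 200 cluster elements, 100 clusters) are consistent with sampling matched pairs rather than sampling each graph independently. The sketch below illustrates that idea in plain Python. It is an assumption about the general technique, not a description of how forayer implements `sample`, and `sample_task` and the toy data are hypothetical.

```python
import random
from typing import Dict, List, Optional, Set, Tuple

# Hypothetical stand-in: entity id -> {attribute name: attribute value}.
Entities = Dict[str, Dict[str, str]]


def sample_task(
    left: Entities,
    right: Entities,
    clusters: List[Set[str]],
    n: int,
    seed: Optional[int] = None,
) -> Tuple[Entities, Entities, List[Set[str]]]:
    """Pick `n` match clusters at random and keep only the entities they mention."""
    rng = random.Random(seed)
    chosen = rng.sample(clusters, n)
    keep = set().union(*chosen)  # ids of every entity in a chosen cluster
    left_sample = {eid: attrs for eid, attrs in left.items() if eid in keep}
    right_sample = {eid: attrs for eid, attrs in right.items() if eid in keep}
    return left_sample, right_sample, chosen


# Toy task: ten entities per graph, matched one to one.
left = {f"l{i}": {"name": f"entity {i}"} for i in range(10)}
right = {f"r{i}": {"label": f"Entity {i}"} for i in range(10)}
clusters = [{f"l{i}", f"r{i}"} for i in range(10)]

left_s, right_s, clusters_s = sample_task(left, right, clusters, n=3, seed=0)
print(len(left_s), len(right_s), len(clusters_s))
```

Sampling by cluster keeps every sampled entity paired with its known match, so the smaller task can still be used to evaluate entity resolution.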

More functionality is described in the user guide.

Installation

You can install forayer via pip:

  pip install forayer
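
To confirm that the installation succeeded, one option is to query the installed version through the standard library. This is a small sketch that assumes Python 3.8 or later, and the `installed_version` helper is hypothetical, not part of forayer.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is not installed."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# Prints a version string if forayer is installed, otherwise None.
print(installed_version("forayer"))
```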
