Library to process dumps of knowledge graphs (Wikipedia, DBpedia, Wikidata)

kgdata

KGData is a library for processing dumps of Wikipedia and Wikidata. It can:

  • Clean up the dumps to ensure the data is consistent (resolve redirects, remove dangling references).
  • Create embedded key-value databases to access entities from the dumps.
  • Extract Wikidata ontology.
  • Extract Wikipedia tables and convert the hyperlinks to Wikidata entities.
  • Create Pyserini indices to search Wikidata’s entities.
  • And more.
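To make the first item in the list above concrete, here is a minimal sketch of what resolving redirects and removing dangling references can look like on a toy link table. It does not use KGData's API: the names `resolve_redirect`, `clean_links`, and the sample identifiers are hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Set


def resolve_redirect(entity_id: str, redirects: Dict[str, str]) -> str:
    """Follow a redirect chain to its final target, stopping if a cycle is found."""
    seen: Set[str] = set()
    while entity_id in redirects and entity_id not in seen:
        seen.add(entity_id)
        entity_id = redirects[entity_id]
    return entity_id


def clean_links(
    links: Dict[str, List[str]],
    redirects: Dict[str, str],
    known_ids: Set[str],
) -> Dict[str, List[str]]:
    """Resolve every link target, then drop targets that do not exist (dangling)."""
    cleaned: Dict[str, List[str]] = {}
    for source, targets in links.items():
        resolved = (resolve_redirect(t, redirects) for t in targets)
        cleaned[source] = [t for t in resolved if t in known_ids]
    return cleaned


# Hypothetical toy data: Q2 redirects to Q1, and Q404 does not exist.
redirects = {"Q2": "Q1"}
known_ids = {"Q1", "Q3"}
links = {"Q3": ["Q2", "Q404"]}

print(clean_links(links, redirects, known_ids))  # {'Q3': ['Q1']}
```

In a real dump, the redirect table and the set of known identifiers would come from the dump itself rather than from in-memory literals.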

For full documentation, please see the website.

Installation

From PyPI (using pre-built binaries):

pip install "kgdata[spark]"   # omit the [spark] extra if your cluster runs a different Spark version, then install the matching version yourself
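After installing, one way to confirm which version is present is to query the package metadata with the standard library. This is a small sketch; the helper name `installed_version` is hypothetical, and it returns `None` when the package is not installed.

```python
from importlib import metadata
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Prints the installed kgdata version, or None if kgdata is not installed.
print(installed_version("kgdata"))
```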

This version

7.0.9


