Library to process dumps of knowledge graphs (Wikipedia, DBpedia, Wikidata)
kgdata

KGData is a library for processing Wikipedia and Wikidata dumps. It can:
- Clean up the dumps to ensure the data is consistent (resolve redirects, remove dangling references).
- Create embedded key-value databases to access entities from the dumps.
- Extract Wikidata ontology.
- Extract Wikipedia tables and convert the hyperlinks to Wikidata entities.
- Create Pyserini indices to search Wikidata’s entities.
- And more.
For full documentation, please see the website.
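To make the cleanup step concrete, here is a minimal sketch of what resolving redirects and removing dangling references means on toy data. It does not use the kgdata API: the function names `resolve_redirects` and `clean_links`, the dictionary layout, and the entity IDs are all hypothetical and chosen only for illustration.

```python
def resolve_redirects(redirects: dict[str, str], entity_id: str) -> str:
    """Follow a redirect chain to its final target, guarding against cycles."""
    seen = {entity_id}
    while entity_id in redirects:
        entity_id = redirects[entity_id]
        if entity_id in seen:
            raise ValueError(f"redirect cycle at {entity_id}")
        seen.add(entity_id)
    return entity_id


def clean_links(
    entities: dict[str, list[str]], redirects: dict[str, str]
) -> dict[str, list[str]]:
    """Rewrite links through redirects, then drop links to unknown entities."""
    cleaned = {}
    for entity_id, links in entities.items():
        resolved = (resolve_redirects(redirects, target) for target in links)
        # A link is "dangling" when its resolved target is not in the dump.
        cleaned[entity_id] = [target for target in resolved if target in entities]
    return cleaned


# Toy data (hypothetical IDs): Q2 links to Q9, which redirects to Q1,
# and to Q404, which does not exist in the dump.
entities = {"Q1": [], "Q2": ["Q9", "Q404"]}
redirects = {"Q9": "Q1"}
cleaned = clean_links(entities, redirects)
print(cleaned)
```

On real dumps the same idea has to run at a much larger scale, which is presumably why the library offers an optional Spark dependency.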
Installation
From PyPI (using pre-built binaries):
pip install kgdata[spark]  # omit the [spark] extra to install Spark yourself if your cluster runs a different version
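After installing, you may want to confirm which version ended up in your environment. The following sketch uses only the standard library's `importlib.metadata`; the helper name `installed_version` is an assumption made for this example and is not part of kgdata.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(distribution: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return version(distribution)
    except PackageNotFoundError:
        return None


# Prints the installed kgdata version, or None if kgdata is not installed.
print(installed_version("kgdata"))
```

The same helper can be pointed at `pyspark` to check that the Spark version you installed matches the one on your cluster.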
Download files
Source Distribution
kgdata-7.0.12.tar.gz (112.2 kB)
Built Distribution
kgdata-7.0.12-py3-none-any.whl (172.0 kB)
File details
Details for the file kgdata-7.0.12.tar.gz.
File metadata
- Download URL: kgdata-7.0.12.tar.gz
- Upload date:
- Size: 112.2 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.10.18
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | a0a859e2510f0726b847876fa65d392b40353b0011c317e513c2c898b6b937be |
| MD5 | b1111eb17ec024e9332c5d92811b3774 |
| BLAKE2b-256 | 708428cd06534eea1bff1f1555145b9f057426b7bc3bbc457ab489724186ec1f |
File details
Details for the file kgdata-7.0.12-py3-none-any.whl.
File metadata
- Download URL: kgdata-7.0.12-py3-none-any.whl
- Upload date:
- Size: 172.0 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.10.18
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 05640418a3e9fa5ff72367dfc93bf6703ea77919df840d093a5ec0f9132b3a02 |
| MD5 | 084350d27bc0db0a422c46c2da77533a |
| BLAKE2b-256 | b6c29dead0170ca1f9089d4cf48d53d5ce9e3e104921d6e88fcc580979f9f74d |
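If you download either file manually, you can compare its SHA256 digest against the values listed above. The sketch below uses only the standard library; the helper names `sha256_of` and `matches_published_hash` are assumptions made for this example, and the file is assumed to keep its original name.

```python
import hashlib
import hmac
from pathlib import Path

# SHA256 digests copied from the file hash tables above.
EXPECTED_SHA256 = {
    "kgdata-7.0.12.tar.gz":
        "a0a859e2510f0726b847876fa65d392b40353b0011c317e513c2c898b6b937be",
    "kgdata-7.0.12-py3-none-any.whl":
        "05640418a3e9fa5ff72367dfc93bf6703ea77919df840d093a5ec0f9132b3a02",
}


def sha256_of(path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large archives need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path) -> bool:
    """Compare a downloaded file against the digest listed for its file name."""
    expected = EXPECTED_SHA256[Path(path).name]
    return hmac.compare_digest(sha256_of(path), expected)
```

For example, `matches_published_hash("kgdata-7.0.12.tar.gz")` should return `True` for an intact download in the current directory. pip can perform a similar check itself through its hash-checking mode when the digests are pinned in a requirements file.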