Library to process dumps of knowledge graphs (Wikipedia, DBpedia, Wikidata)
Project description
kgdata

KGData is a library to process dumps of Wikipedia and Wikidata. What it can do:
- Clean up the dumps to ensure the data is consistent (resolve redirects, remove dangling references).
- Create embedded key-value databases to access entities from the dumps.
- Extract Wikidata ontology.
- Extract Wikipedia tables and convert the hyperlinks to Wikidata entities.
- Create Pyserini indices to search Wikidata entities.
- And more.
For full documentation, please see the website.
Installation
From PyPI (using pre-built binaries):
pip install kgdata[spark] # omit [spark] to install Spark yourself, e.g. when your cluster runs a different Spark version
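After installing, it can be useful to confirm which release was picked up. The following is a minimal sketch using only the standard library's `importlib.metadata`; the helper name `installed_version` is an illustrative assumption and not part of kgdata.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        # The distribution is not installed in the current environment.
        return None


if __name__ == "__main__":
    # Prints the installed kgdata version, or None if kgdata is not installed.
    print(installed_version("kgdata"))
```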
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distributions
File details
Details for the file kgdata-7.0.8.tar.gz.
File metadata
- Download URL: kgdata-7.0.8.tar.gz
- Upload date:
- Size: 162.7 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.10.12
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 6d63823819cf7dd59caa20be9d5469a4ff6e537f5f99ae9f50caac81c45387a0 |
| MD5 | d2c3492b35273db9b752a0012ca0405e |
| BLAKE2b-256 | 6d530afa82ec7b9225135b78d665b336488db99296af7b525d8dbf92c1fbef0c |
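The digests listed for each file can be checked against a local download. The following is a minimal sketch using only Python's standard `hashlib`; the helper names `sha256_of` and `matches_published` are illustrative assumptions and not part of kgdata or PyPI tooling.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as f:
        # Read fixed-size chunks so large wheels are never loaded whole into memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published(path: Path, expected_hex: str) -> bool:
    """Compare a file's SHA256 digest with a published hex digest."""
    # Normalize whitespace and case before comparing.
    return sha256_of(path) == expected_hex.strip().lower()
```

For example, assuming the source distribution was saved in the current directory, `matches_published(Path("kgdata-7.0.8.tar.gz"), "6d63823819cf7dd59caa20be9d5469a4ff6e537f5f99ae9f50caac81c45387a0")` should return `True` for an intact download.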
File details
Details for the file kgdata-7.0.8-py3-none-any.whl.
File metadata
- Download URL: kgdata-7.0.8-py3-none-any.whl
- Upload date:
- Size: 171.6 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.10.17
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 81f33e4b88d41a5899506c4a8ecf5bd02466bc364b5096a4a9d8843ea649d176 |
| MD5 | 76756a9e2b955429e739814f7dd115a8 |
| BLAKE2b-256 | 8f2585e139147d6a3cbf653ce12b2114cbe73666414e54e410fcc002c9f20a31 |
File details
Details for the file kgdata-7.0.8-pp311-pypy311_pp73-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.
File metadata
- Download URL: kgdata-7.0.8-pp311-pypy311_pp73-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
- Upload date:
- Size: 3.3 MB
- Tags: PyPy, manylinux: glibc 2.17+ x86-64
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.10.12
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 02c651be79ecfac26e959a0167847de61065be41dc55a8af8c22862c376f516c |
| MD5 | b63400f77000af4724721a6bb3ce2871 |
| BLAKE2b-256 | 47da24455b114927a4df896af3ef3f7534dd901bfac48d859a1f552b2efc85ff |
File details
Details for the file kgdata-7.0.8-pp310-pypy310_pp73-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.
File metadata
- Download URL: kgdata-7.0.8-pp310-pypy310_pp73-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
- Upload date:
- Size: 3.3 MB
- Tags: PyPy, manylinux: glibc 2.17+ x86-64
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.10.12
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 7a39b43c8a47b90cfd313c430d2074262b9795c5c5f94e79f87a44016afe3ace |
| MD5 | e1f4a3ddd2e8cc49671fb954ac7261d5 |
| BLAKE2b-256 | f7ea5dcbd1cab6033e730302d5db65519a531f9308f4882a684588f44e44204a |
File details
Details for the file kgdata-7.0.8-cp313-cp313t-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.
File metadata
- Download URL: kgdata-7.0.8-cp313-cp313t-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
- Upload date:
- Size: 3.3 MB
- Tags: CPython 3.13t, manylinux: glibc 2.17+ x86-64
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.10.12
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 01dac4772d73bafc793dc048924a9065bd9aab9a24bd7e227341ef5c168b013e |
| MD5 | 3cca1c8c1390b48acd7fac21cfb7d3a2 |
| BLAKE2b-256 | d13f61f09c60d57294d1235bb923558863da680275adcce52fbf477399048852 |
File details
Details for the file kgdata-7.0.8-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.
File metadata
- Download URL: kgdata-7.0.8-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
- Upload date:
- Size: 3.3 MB
- Tags: CPython 3.12, manylinux: glibc 2.17+ x86-64
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.10.12
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | b8090181a63b5c980822104bf3722adc7948a56948241f9ea80e26334ca78c3f |
| MD5 | 0beac213a7524a73c6414a1e0b809405 |
| BLAKE2b-256 | 27353d7d58ead2ecf3aa3f25ee71748b33c10889ad83d6916ec27b3e8030e057 |
File details
Details for the file kgdata-7.0.8-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.
File metadata
- Download URL: kgdata-7.0.8-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
- Upload date:
- Size: 3.3 MB
- Tags: CPython 3.11, manylinux: glibc 2.17+ x86-64
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.10.12
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 8180ef70362f3ca9083b2faf7e6eafca8dcfd80af9f28515e107532104a02403 |
| MD5 | 566b45f0f53aa079f620068dc5a95be4 |
| BLAKE2b-256 | e08cade0855f11025d37ce8dc957e1f526ec4bcf061fd03cc1c80dca758b6f83 |
File details
Details for the file kgdata-7.0.8-cp310-cp310-win_amd64.whl.
File metadata
- Download URL: kgdata-7.0.8-cp310-cp310-win_amd64.whl
- Upload date:
- Size: 2.3 MB
- Tags: CPython 3.10, Windows x86-64
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.10.12
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 16f4c653827ae4d596fda2b58269e9b83aecf4e004e028a4eaa18452ec77bba8 |
| MD5 | 34c440be318d6f7c4b6f0552aaa50e34 |
| BLAKE2b-256 | 85ca51115afc50516ac3473182f420d567040b0b59b84ce25daebdd634697365 |
File details
Details for the file kgdata-7.0.8-cp310-cp310-manylinux_2_35_x86_64.whl.
File metadata
- Download URL: kgdata-7.0.8-cp310-cp310-manylinux_2_35_x86_64.whl
- Upload date:
- Size: 3.3 MB
- Tags: CPython 3.10, manylinux: glibc 2.35+ x86-64
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.10.12
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 6aa0c38d2a0651f0a2d4eedb1150b5449dfacdc78108a95d5706589cb7be614f |
| MD5 | f5118bd86747b89e6a91a075c40d5ce2 |
| BLAKE2b-256 | f7c73e5dc0790e39e3daebb85b5c53b07dbfb77b50f150b0b038adf8b505942f |
File details
Details for the file kgdata-7.0.8-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.
File metadata
- Download URL: kgdata-7.0.8-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
- Upload date:
- Size: 3.3 MB
- Tags: CPython 3.10, manylinux: glibc 2.17+ x86-64
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.10.12
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 738bc2d869a4069208048ebe20d36b0e006c0d601e3ef94efeb3833b197a7bdd |
| MD5 | b574b04a3c27cecb438f7404be1ec362 |
| BLAKE2b-256 | 589f842436e00e161f194aac0ac03fde078ff3120fd71b8929e40e9329705685 |
File details
Details for the file kgdata-7.0.8-cp310-cp310-macosx_10_14_x86_64.macosx_11_0_arm64.macosx_10_14_universal2.whl.
File metadata
- Download URL: kgdata-7.0.8-cp310-cp310-macosx_10_14_x86_64.macosx_11_0_arm64.macosx_10_14_universal2.whl
- Upload date:
- Size: 5.4 MB
- Tags: CPython 3.10, macOS 10.14+ universal2 (ARM64, x86-64), macOS 10.14+ x86-64, macOS 11.0+ ARM64
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.10.12
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | f79f52aa159a80be7b8c25a19a318f68fee18eae3b5afcab66c40cdbf006b51c |
| MD5 | 6dca9980dca34f880da50683a8ed250d |
| BLAKE2b-256 | b55ba6cb480937d90f11724d5b32ae35cf060a316a4d4200cbc45e0ee13aac56 |
File details
Details for the file kgdata-7.0.8-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.
File metadata
- Download URL: kgdata-7.0.8-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
- Upload date:
- Size: 3.3 MB
- Tags: CPython 3.9, manylinux: glibc 2.17+ x86-64
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.10.12
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 1c62a5812730302bb3f21774cab79737e3327f50d9553fdaf7cc83ef389cabb1 |
| MD5 | 3da909cfb8bbbdc8e1e2a3a773707c34 |
| BLAKE2b-256 | 98415d3136f044ced3ab6b6ae939db4d30c5bd65342584432727caa8db92d6c6 |