Load fully-typed information extraction data in a single line.
Information Extraction Datasets
This package handles the tedium of loading various information extraction datasets and returns the data as fully validated, typed Pydantic objects.
Datasets
BioRED
Example
from ie_datasets import BioRED
BioRED.load_units("Train")
BioRED.load_units("Dev")
BioRED.load_units("Test")
ChemProt
Example
from ie_datasets import ChemProt
ChemProt.load_units("train")
ChemProt.load_units("validation")
ChemProt.load_units("test")
CrossRE
Example
from ie_datasets import CrossRE
for domain in ("ai", "literature", "music", "news", "politics", "science"):
    CrossRE.load_units("train")
    CrossRE.load_units("dev")
    CrossRE.load_units("test")
CUAD
Example
from ie_datasets import CUAD
CUAD.load_units()
DocRED
Example
from ie_datasets import DocRED
DocRED.load_schema()
DocRED.load_units("train_annotated")
DocRED.load_units("train_distant")
DocRED.load_units("validation")
DocRED.load_units("test")
NOTE: DocRED has been superseded by Re-DocRED
HyperRED
Example
from ie_datasets import HyperRED
HyperRED.load_units("train")
HyperRED.load_units("validation")
HyperRED.load_units("test")
KnowledgeNet
Example
from ie_datasets import KnowledgeNet
KnowledgeNet.load_units("train")
KnowledgeNet.load_units("test-no-facts") # unlabelled
SciERC
Example
from ie_datasets import SciERC
SciERC.load_units("train")
SciERC.load_units("dev")
SciERC.load_units("test")
SoMeSci
Example
from ie_datasets import SoMeSci
SoMeSci.load_schema()
for group in ("Creation_sentences", "PLoS_methods", "PLoS_sentences", "Pubmed_fulltext"):
    SoMeSci.load_units(group=group, split="train")
    SoMeSci.load_units(group=group, split="devel")
    SoMeSci.load_units(group=group, split="test")
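The loop above discards what it loads. A minimal sketch of keeping every group/split combination in one dictionary might look like the following. The helper `load_grid` and the stand-in `fake_load_units` are hypothetical names, not part of ie_datasets, and the sketch assumes that `load_units` returns an iterable of units.

```python
from itertools import product
from typing import Any, Callable, Dict, Iterable, List, Sequence, Tuple

# Group and split names copied from the SoMeSci example.
GROUPS = ("Creation_sentences", "PLoS_methods", "PLoS_sentences", "Pubmed_fulltext")
SPLITS = ("train", "devel", "test")


def load_grid(
    load_units: Callable[..., Iterable[Any]],
    groups: Sequence[str] = GROUPS,
    splits: Sequence[str] = SPLITS,
) -> Dict[Tuple[str, str], List[Any]]:
    """Call load_units(group=..., split=...) for every combination.

    Hypothetical helper: assumes the loader returns an iterable of units.
    """
    return {
        (group, split): list(load_units(group=group, split=split))
        for group, split in product(groups, splits)
    }


def fake_load_units(*, group: str, split: str) -> List[str]:
    """Stand-in loader so the sketch runs without the package or any data."""
    return [f"{group}/{split}/unit-0"]


grid = load_grid(fake_load_units)
print(len(grid))  # 4 groups x 3 splits = 12 entries
```

With the package installed, passing `SoMeSci.load_units` in place of `fake_load_units` should work as long as the iterable assumption holds.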
Re-DocRED
Example
from ie_datasets import ReDocRED
ReDocRED.load_schema()
ReDocRED.load_units("train")
ReDocRED.load_units("validation")
ReDocRED.load_units("test")
TPLinker/NYT
Example
from ie_datasets import TPLinkerNYT
TPLinkerNYT.load_schema()
TPLinkerNYT.load_units("train")
TPLinkerNYT.load_units("valid")
TPLinkerNYT.load_units("test")
TPLinker/WebNLG
Example
from ie_datasets import TPLinkerWebNLG
TPLinkerWebNLG.load_schema()
TPLinkerWebNLG.load_units("train")
TPLinkerWebNLG.load_units("valid")
TPLinkerWebNLG.load_units("test")
WikiEvents
Example
from ie_datasets import WikiEvents
WikiEvents.load_ontology()
WikiEvents.load_units("train")
WikiEvents.load_units("dev")
WikiEvents.load_units("test")
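The examples above use different names for the same kind of split ("Dev", "dev", "valid", "validation"). A small lookup table can hide that difference when code iterates over several datasets. This is only a sketch: `SPLIT_NAMES`, `CANONICAL`, and `resolve_split` are hypothetical names, not part of ie_datasets, and the table covers only the datasets whose examples show exactly one train, one validation, and one test split (CUAD, DocRED, KnowledgeNet, and SoMeSci are left out because their calls differ).

```python
from typing import Dict, Tuple

# Canonical split names used by this sketch (an assumption, not a package convention).
CANONICAL: Tuple[str, str, str] = ("train", "validation", "test")

# Dataset-specific split names, copied from the examples in this README,
# in the same order as CANONICAL.
SPLIT_NAMES: Dict[str, Tuple[str, str, str]] = {
    "BioRED": ("Train", "Dev", "Test"),
    "ChemProt": ("train", "validation", "test"),
    "CrossRE": ("train", "dev", "test"),
    "HyperRED": ("train", "validation", "test"),
    "SciERC": ("train", "dev", "test"),
    "ReDocRED": ("train", "validation", "test"),
    "TPLinkerNYT": ("train", "valid", "test"),
    "TPLinkerWebNLG": ("train", "valid", "test"),
    "WikiEvents": ("train", "dev", "test"),
}


def resolve_split(dataset: str, canonical: str) -> str:
    """Translate a canonical split name into the name a dataset expects.

    Raises KeyError for an unknown dataset and ValueError for an unknown split.
    """
    names = SPLIT_NAMES[dataset]
    return names[CANONICAL.index(canonical)]


print(resolve_split("BioRED", "validation"))       # Dev
print(resolve_split("TPLinkerNYT", "validation"))  # valid
```

The resolved name can then be passed to the matching class, for example `BioRED.load_units(resolve_split("BioRED", "validation"))`.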