Load fully-typed information extraction data in a single line.
Information Extraction Datasets
This package handles all the tedium of loading various information extraction datasets, providing the data as fully validated, typed Pydantic objects.
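To illustrate what "fully validated and typed" means in practice, here is a minimal sketch of the pattern using Pydantic directly. The field names (`id`, `text`, `entities`, etc.) are hypothetical stand-ins, not the actual ie_datasets schema:

```python
from pydantic import BaseModel

# Illustrative stand-in models; the real ie_datasets unit types differ.
class Entity(BaseModel):
    start: int
    end: int
    label: str

class Unit(BaseModel):
    id: str
    text: str
    entities: list[Entity]

raw = {
    "id": "doc-1",
    "text": "Aspirin inhibits COX-1.",
    "entities": [{"start": 0, "end": 7, "label": "Chemical"}],
}
unit = Unit.model_validate(raw)  # raises ValidationError on malformed data
print(unit.entities[0].label)  # → Chemical
```

Because every record passes through validation at load time, malformed rows surface as errors immediately rather than as silent downstream bugs.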
Datasets
BioRED
Example
from ie_datasets import BioRED
BioRED.load_units(BioRED.Split.TRAIN)
ChemProt
Example
from ie_datasets import ChemProt
ChemProt.load_units(ChemProt.Split.TRAIN)
CrossRE
Example
from ie_datasets import CrossRE
CrossRE.load_units(CrossRE.Split.TRAIN, domain=CrossRE.Domain.AI)
CUAD
Example
from ie_datasets import CUAD
CUAD.load_units()
DEFT
Example
from ie_datasets import DEFT
DEFT.load_units(DEFT.Split.TRAIN, category=DEFT.Category.BIOLOGY)
NOTE: DEFT's data files contain a large number of errors. For now, we drop the erroneous records instead of fixing them, which means we load a subset of DEFT rather than the full dataset.
DocRED
Example
from ie_datasets import DocRED
DocRED.load_schema()
DocRED.load_units(DocRED.Split.TRAIN_ANNOTATED)
NOTE: DocRED has been superseded by Re-DocRED (see below).
HyperRED
Example
from ie_datasets import HyperRED
HyperRED.load_units(HyperRED.Split.TRAIN)
KnowledgeNet
Example
from ie_datasets import KnowledgeNet
KnowledgeNet.load_units(KnowledgeNet.Split.TRAIN)
NOTE: The test split of KnowledgeNet is unlabelled.
Re-DocRED
Example
from ie_datasets import ReDocRED
ReDocRED.load_schema()
ReDocRED.load_units(ReDocRED.Split.TRAIN)
SciERC
Example
from ie_datasets import SciERC
SciERC.load_units(SciERC.Split.TRAIN)
SciREX
Example
from ie_datasets import SciREX
SciREX.load_units(SciREX.Split.TRAIN)
SoMeSci
Example
from ie_datasets import SoMeSci
SoMeSci.load_schema()
SoMeSci.load_units(SoMeSci.Split.TRAIN, group=SoMeSci.Group.CREATION_SENTENCES)
TPLinker/NYT
Example
from ie_datasets import TPLinkerNYT
TPLinkerNYT.load_schema()
TPLinkerNYT.load_units(TPLinkerNYT.Split.TRAIN)
TPLinker/WebNLG
Example
from ie_datasets import TPLinkerWebNLG
TPLinkerWebNLG.load_schema()
TPLinkerWebNLG.load_units(TPLinkerWebNLG.Split.TRAIN)
WikiEvents
Example
from ie_datasets import WikiEvents
WikiEvents.load_ontology()
WikiEvents.load_units(WikiEvents.Split.TRAIN)