
Load fully-typed information extraction data in a single line.

Project description

Information Extraction Datasets

This package takes care of all of the tedium of loading various information extraction datasets, providing the data as fully validated and typed Pydantic objects.
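The "fully validated and typed" claim can be illustrated with a minimal sketch. The model and field names below are hypothetical and do not match the package's actual per-dataset schemas; they only show the kind of Pydantic objects the loaders return.

```python
from pydantic import BaseModel, ValidationError

# Hypothetical shapes, for illustration only; the real ie_datasets
# models differ from dataset to dataset.
class Entity(BaseModel):
    id: str
    type: str
    start: int  # character offset of the mention in `text`
    end: int

class Unit(BaseModel):
    text: str
    entities: list[Entity]

# Valid data parses into typed objects with attribute access...
unit = Unit.model_validate({
    "text": "Aspirin inhibits COX-1.",
    "entities": [
        {"id": "T1", "type": "Chemical", "start": 0, "end": 7},
    ],
})
assert unit.entities[0].type == "Chemical"

# ...while malformed data fails loudly at load time.
try:
    Unit.model_validate({"text": "missing the entities key"})
except ValidationError:
    print("rejected")
```

Because every loader returns models along these lines, malformed upstream records surface as a ValidationError when the data is loaded, rather than as silent KeyErrors later in a pipeline.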

Datasets

BioRED

Example
from ie_datasets import BioRED
BioRED.load_units(BioRED.Split.TRAIN)

ChemProt

Example
from ie_datasets import ChemProt
ChemProt.load_units(ChemProt.Split.TRAIN)

CrossRE

Example
from ie_datasets import CrossRE
CrossRE.load_units(CrossRE.Split.TRAIN, domain=CrossRE.Domain.AI)

CUAD

Example
from ie_datasets import CUAD
CUAD.load_units()

DEFT

Example
from ie_datasets import DEFT
DEFT.load_units(DEFT.Split.TRAIN, category=DEFT.Category.BIOLOGY)

NOTE: DEFT's data files contain a large number of errata. For now, we drop the erroneous records instead of fixing them, which means we load a subset of DEFT rather than the full dataset.

DocRED

Example
from ie_datasets import DocRED
DocRED.load_schema()
DocRED.load_units(DocRED.Split.TRAIN_ANNOTATED)

NOTE: DocRED has been superseded by Re-DocRED (listed below).

HyperRED

Example
from ie_datasets import HyperRED
HyperRED.load_units(HyperRED.Split.TRAIN)

KnowledgeNet

Example
from ie_datasets import KnowledgeNet
KnowledgeNet.load_units(KnowledgeNet.Split.TRAIN)

NOTE: The test split of KnowledgeNet is unlabelled.

Re-DocRED

Example
from ie_datasets import ReDocRED
ReDocRED.load_schema()
ReDocRED.load_units(ReDocRED.Split.TRAIN)

SciERC

Example
from ie_datasets import SciERC
SciERC.load_units(SciERC.Split.TRAIN)

SciREX

Example
from ie_datasets import SciREX
SciREX.load_units(SciREX.Split.TRAIN)

SoMeSci

Example
from ie_datasets import SoMeSci
SoMeSci.load_schema()
SoMeSci.load_units(SoMeSci.Split.TRAIN, group=SoMeSci.Group.CREATION_SENTENCES)

TPLinker/NYT

Example
from ie_datasets import TPLinkerNYT
TPLinkerNYT.load_schema()
TPLinkerNYT.load_units(TPLinkerNYT.Split.TRAIN)

TPLinker/WebNLG

Example
from ie_datasets import TPLinkerWebNLG
TPLinkerWebNLG.load_schema()
TPLinkerWebNLG.load_units(TPLinkerWebNLG.Split.TRAIN)

WikiEvents

Example
from ie_datasets import WikiEvents
WikiEvents.load_ontology()
WikiEvents.load_units(WikiEvents.Split.TRAIN)
