
Load fully-typed information extraction data in a single line.


Information Extraction Datasets

This package handles the tedium of loading various information extraction datasets, exposing each one as fully validated, typed Pydantic objects.
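To illustrate what "fully validated and typed Pydantic objects" buys you, here is a minimal sketch of how a dataset unit might be modelled with Pydantic. The `Entity` and `Unit` classes and their fields are hypothetical, chosen for illustration; the actual models in ie_datasets differ per dataset.

```python
from typing import List

from pydantic import BaseModel, ValidationError


class Entity(BaseModel):
    start: int  # character offset where the mention begins
    end: int    # character offset where the mention ends
    label: str  # entity type


class Unit(BaseModel):
    text: str
    entities: List[Entity]


# Well-formed data parses into a fully typed object...
unit = Unit(
    text="Aspirin inhibits COX-1.",
    entities=[{"start": 0, "end": 7, "label": "Chemical"}],
)
assert unit.entities[0].label == "Chemical"

# ...while malformed data fails loudly at load time
# instead of surfacing as a cryptic error downstream.
try:
    Unit(text="oops", entities=[{"start": "not-an-int"}])
except ValidationError:
    print("rejected")
```

Because every unit is validated up front, type checkers and IDE autocompletion work on the loaded data, and schema mismatches surface immediately rather than deep inside a training loop.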

Datasets

BioRED

Example
from ie_datasets import BioRED
BioRED.load_units("Train")
BioRED.load_units("Dev")
BioRED.load_units("Test")

ChemProt

Example
from ie_datasets import ChemProt
ChemProt.load_units("train")
ChemProt.load_units("validation")
ChemProt.load_units("test")

CrossRE

Example
from ie_datasets import CrossRE
for domain in ("ai", "literature", "music", "news", "politics", "science"):
    CrossRE.load_units(domain=domain, split="train")
    CrossRE.load_units(domain=domain, split="dev")
    CrossRE.load_units(domain=domain, split="test")

CUAD

Example
from ie_datasets import CUAD
CUAD.load_units()

DocRED

Example
from ie_datasets import DocRED
DocRED.load_schema()
DocRED.load_units("train_annotated")
DocRED.load_units("train_distant")
DocRED.load_units("validation")
DocRED.load_units("test")

NOTE: DocRED has been superseded by Re-DocRED (listed below), which revises the annotations to correct a large number of missed relations.

HyperRED

Example
from ie_datasets import HyperRED
HyperRED.load_units("train")
HyperRED.load_units("validation")
HyperRED.load_units("test")

KnowledgeNet

Example
from ie_datasets import KnowledgeNet
KnowledgeNet.load_units("train")
KnowledgeNet.load_units("test-no-facts") # unlabelled

SciERC

Example
from ie_datasets import SciERC
SciERC.load_units("train")
SciERC.load_units("dev")
SciERC.load_units("test")

SoMeSci

Example
from ie_datasets import SoMeSci
SoMeSci.load_schema()
for group in ("Creation_sentences", "PLoS_methods", "PLoS_sentences", "Pubmed_fulltext"):
    SoMeSci.load_units(group=group, split="train")
    SoMeSci.load_units(group=group, split="devel")
    SoMeSci.load_units(group=group, split="test")

Re-DocRED

Example
from ie_datasets import ReDocRED
ReDocRED.load_schema()
ReDocRED.load_units("train")
ReDocRED.load_units("validation")
ReDocRED.load_units("test")

TPLinker/NYT

Example
from ie_datasets import TPLinkerNYT
TPLinkerNYT.load_schema()
TPLinkerNYT.load_units("train")
TPLinkerNYT.load_units("valid")
TPLinkerNYT.load_units("test")

TPLinker/WebNLG

Example
from ie_datasets import TPLinkerWebNLG
TPLinkerWebNLG.load_schema()
TPLinkerWebNLG.load_units("train")
TPLinkerWebNLG.load_units("valid")
TPLinkerWebNLG.load_units("test")

WikiEvents

Example
from ie_datasets import WikiEvents
WikiEvents.load_ontology()
WikiEvents.load_units("train")
WikiEvents.load_units("dev")
WikiEvents.load_units("test")
