State-of-the-art Information Extraction in PyTorch
🚀️ Quickstart
```console
$ pip install pytorch-ie
```
⚡️ Examples
Note: Setting `num_workers=0` in the pipeline is only necessary when running an example in an interactive Python session, because multiprocessing does not play well with the interactive Python interpreter.
Span-classification-based Named Entity Recognition
```python
from dataclasses import dataclass

from pytorch_ie import AnnotationList, LabeledSpan, Pipeline, TextDocument, annotation_field
from pytorch_ie.models import TransformerSpanClassificationModel
from pytorch_ie.taskmodules import TransformerSpanClassificationTaskModule


@dataclass
class ExampleDocument(TextDocument):
    entities: AnnotationList[LabeledSpan] = annotation_field(target="text")


model_name_or_path = "pie/example-ner-spanclf-conll03"
ner_taskmodule = TransformerSpanClassificationTaskModule.from_pretrained(model_name_or_path)
ner_model = TransformerSpanClassificationModel.from_pretrained(model_name_or_path)

ner_pipeline = Pipeline(model=ner_model, taskmodule=ner_taskmodule, device=-1, num_workers=0)

document = ExampleDocument(
    "“Making a super tasty alt-chicken wing is only half of it,” said Po Bronson, general partner at SOSV and managing director of IndieBio."
)

ner_pipeline(document, predict_field="entities")

for entity in document.entities.predictions:
    print(f"{entity} -> {entity.label}")

# Result:
# IndieBio -> ORG
# Po Bronson -> PER
# SOSV -> ORG
```
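The predicted entities are character spans over the document text. As a plain-Python sketch (no pytorch-ie required; the offsets are taken from the relation-extraction example below), slicing the text with a span's start and end offsets recovers the entity mention:

```python
# Sketch: a LabeledSpan is a pair of character offsets into the document text,
# so the mention string is just a slice of that text.
text = "“Making a super tasty alt-chicken wing is only half of it,” said Po Bronson, general partner at SOSV and managing director of IndieBio."

# (start, end, label) triples as a span-classification model might predict them
predicted_spans = [(65, 75, "PER"), (96, 100, "ORG"), (126, 134, "ORG")]

for start, end, label in predicted_spans:
    print(f"{text[start:end]} -> {label}")
# Po Bronson -> PER
# SOSV -> ORG
# IndieBio -> ORG
```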
Text-classification-based Relation Extraction
```python
from dataclasses import dataclass

from pytorch_ie import AnnotationList, BinaryRelation, LabeledSpan, Pipeline, TextDocument, annotation_field
from pytorch_ie.models import TransformerTextClassificationModel
from pytorch_ie.taskmodules import TransformerRETextClassificationTaskModule


@dataclass
class ExampleDocument(TextDocument):
    entities: AnnotationList[LabeledSpan] = annotation_field(target="text")
    relations: AnnotationList[BinaryRelation] = annotation_field(target="entities")


model_name_or_path = "pie/example-re-textclf-tacred"
re_taskmodule = TransformerRETextClassificationTaskModule.from_pretrained(model_name_or_path)
re_model = TransformerTextClassificationModel.from_pretrained(model_name_or_path)

re_pipeline = Pipeline(model=re_model, taskmodule=re_taskmodule, device=-1, num_workers=0)

document = ExampleDocument(
    "“Making a super tasty alt-chicken wing is only half of it,” said Po Bronson, general partner at SOSV and managing director of IndieBio."
)

for start, end, label in [(65, 75, "PER"), (96, 100, "ORG"), (126, 134, "ORG")]:
    document.entities.append(LabeledSpan(start=start, end=end, label=label))

re_pipeline(document, predict_field="relations", batch_size=2)

for relation in document.relations.predictions:
    print(f"({relation.head} -> {relation.tail}) -> {relation.label}")

# Result:
# (Po Bronson -> SOSV) -> per:employee_of
# (Po Bronson -> IndieBio) -> per:employee_of
# (SOSV -> Po Bronson) -> org:top_members/employees
# (IndieBio -> Po Bronson) -> org:top_members/employees
```
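Note that the result contains a prediction for both directions of each entity pair: text-classification-based relation extraction scores ordered (head, tail) pairs, so `(Po Bronson -> SOSV)` and `(SOSV -> Po Bronson)` are separate candidates that can receive different labels. As a rough sketch of that idea (assumed behavior for illustration, not the pytorch-ie internals), enumerating the candidates looks like this:

```python
from itertools import permutations

# Sketch: with binary relations, every ordered pair of distinct annotated
# entities is a candidate (head, tail) for the relation classifier.
entities = [("Po Bronson", "PER"), ("SOSV", "ORG"), ("IndieBio", "ORG")]

candidate_pairs = list(permutations(entities, 2))
for (head, _), (tail, _) in candidate_pairs:
    print(f"{head} -> {tail}")

# 3 entities yield 3 * 2 = 6 ordered candidate pairs
print(len(candidate_pairs))
```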
Development Setup
🏅 Acknowledgements
This package is based on the sourcery-ai/python-best-practices-cookiecutter and cjolowicz/cookiecutter-hypermodern-python project templates.
📃 Citation
If you want to cite the framework, feel free to use this:

```bibtex
@misc{alt2022pytorchie,
  author = {Christoph Alt and Arne Binder},
  title = {PyTorch-IE: State-of-the-art Information Extraction in PyTorch},
  year = {2022},
  publisher = {GitHub},
  journal = {GitHub repository},
  howpublished = {\url{https://github.com/ChristophAlt/pytorch-ie}}
}
```