Lightweight cross-lingual coreference resolution with spaCy using ONNX Runtime inference of transformer models.

spacy-coref

Lightweight, fast coreference resolution using a distilled version of AllenNLP's coreference model, exported to ONNX.

✨ Features

  • 🧠 Cross-lingual coreference resolution
  • 🪶 Lightweight model based on AllenNLP’s coref modeling
  • ⚡ Fast inference via ONNX
  • 🔌 Easy integration with spaCy

📦 Installation

$ pip install spacy-coref

🚀 Quickstart

Usage as a standalone component

from spacy_coref import CoreferenceResolver, decode_clusters

resolver = CoreferenceResolver.from_pretrained("talmago/allennlp-coref-onnx-mMiniLMv2-L12-H384-distilled-from-XLMR-Large")

sentences = [
    ["Barack", "Obama", "was", "the", "44th", "President", "of", "the", "United", "States", "."],
    ["He", "was", "born", "in", "Hawaii", "."]
]

pred = resolver(sentences)

print(decode_clusters(sentences, pred["clusters"][0]))

# Output is:
# [['Barack Obama', 'He']]
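
The decoding step above can be pictured with a minimal sketch. It assumes, as in AllenNLP's coreference output, that each cluster is a list of (start, end) token spans with an inclusive end, indexed over the flattened document. The function name decode_clusters_sketch and the hard-coded spans are hypothetical, for illustration only; the real decode_clusters may differ.

```python
from typing import List, Sequence, Tuple

Span = Tuple[int, int]  # (start, end), end inclusive -- an assumption


def decode_clusters_sketch(
    sentences: Sequence[Sequence[str]],
    clusters: Sequence[Sequence[Span]],
) -> List[List[str]]:
    """Turn token-index spans into readable mention strings."""
    # Flatten the sentences so that document-level token indices apply.
    tokens = [tok for sent in sentences for tok in sent]
    return [
        [" ".join(tokens[start : end + 1]) for start, end in cluster]
        for cluster in clusters
    ]


sentences = [
    ["Barack", "Obama", "was", "the", "44th", "President", "of", "the", "United", "States", "."],
    ["He", "was", "born", "in", "Hawaii", "."],
]

# Hypothetical spans: tokens 0-1 ("Barack Obama") and token 11 ("He").
clusters = [[(0, 1), (11, 11)]]

print(decode_clusters_sketch(sentences, clusters))
```

Working on the flattened token list keeps the spans valid across sentence boundaries, which is why the pronoun in the second sentence has index 11 rather than 0.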

Usage with spaCy

import spacy

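# Imported for its side effect: this presumably registers the "coref_minilm" factory used below.
from spacy_coref import create_coref_minilm_component  # noqa: F401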

nlp = spacy.load("en_core_web_sm")
nlp.add_pipe("coref_minilm")

doc = nlp("Barack Obama was born in Hawaii. He was elected president in 2008.")
print(doc._.coref_clusters[0])
print(doc._.cluster_heads)
print(doc._.resolved_text)

# Output is:
# [Barack Obama, He]
# {'Barack Obama': Barack Obama}
# Barack Obama was born in Hawaii. Barack Obama was elected president in 2008.
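
To illustrate how a resolved text like the one above could be derived from clusters, here is a minimal sketch in plain Python. It assumes inclusive (start, end) token spans and treats the first mention of each cluster as its head; both are assumptions, and the helper resolve_tokens_sketch is hypothetical rather than part of spacy-coref.

```python
from typing import Dict, List, Sequence, Tuple

Span = Tuple[int, int]  # (start, end), end inclusive -- an assumption


def resolve_tokens_sketch(
    tokens: Sequence[str],
    clusters: Sequence[Sequence[Span]],
) -> List[str]:
    """Replace every non-head mention with the tokens of its cluster head."""
    # Map the start index of each non-head mention to (end index, head tokens).
    replacements: Dict[int, Tuple[int, List[str]]] = {}
    for cluster in clusters:
        head_start, head_end = cluster[0]  # assumed: first mention is the head
        head = list(tokens[head_start : head_end + 1])
        for start, end in cluster[1:]:
            replacements[start] = (end, head)

    resolved: List[str] = []
    i = 0
    while i < len(tokens):
        if i in replacements:
            end, head = replacements[i]
            resolved.extend(head)  # emit the head in place of the mention
            i = end + 1            # skip past the replaced mention
        else:
            resolved.append(tokens[i])
            i += 1
    return resolved


tokens = ["Barack", "Obama", "was", "born", "in", "Hawaii", ".",
          "He", "was", "elected", "president", "in", "2008", "."]
clusters = [[(0, 1), (7, 7)]]  # hypothetical spans: "Barack Obama", "He"

resolved = resolve_tokens_sketch(tokens, clusters)
print(" ".join(resolved))
```

This sketch joins tokens with single spaces, so punctuation spacing differs from the original text; a spaCy-based implementation would more likely reuse each token's trailing whitespace.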

🛠️ Development

Set up virtualenv

$ make env

Set PYTHONPATH (run from the repository root)

$ export PYTHONPATH="$PYTHONPATH:$(pwd)/src"

Code formatting

$ make format
