spaCy wrapper for Trankit, a Transformer-based multilingual neural dependency parser with tokenization and NER

spaCy + Trankit

This package wraps the Trankit library so that you can use Trankit models in a spaCy pipeline.

With this wrapper, you can access the following annotations, computed by your pretrained Trankit pipeline/model:

  • Statistical tokenization (reflected in the Doc and its tokens)
  • Lemmatization (token.lemma and token.lemma_)
  • Part-of-speech tagging (token.tag, token.tag_, token.pos, token.pos_)
  • Morphological analysis (token.morph)
  • Dependency parsing (token.dep, token.dep_, token.head)
  • Named entity recognition (doc.ents, token.ent_type, token.ent_type_, token.ent_iob, token.ent_iob_)
  • Sentence segmentation (doc.sents)
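
As a rough illustration of how the attributes listed above might be read from a parsed document, the sketch below collects them into a plain dictionary per token. The helper names `token_annotations` and `doc_annotations` are hypothetical and not part of spacy-trankit; the code relies only on standard spaCy `Token` attributes and assumes the wrapper fills them in as described.

```python
def token_annotations(token):
    """Collect the annotations listed above for one spaCy-style token.

    Hypothetical helper, not part of spacy-trankit.
    """
    return {
        "text": token.text,
        "lemma": token.lemma_,
        "pos": token.pos_,        # coarse-grained part of speech
        "tag": token.tag_,        # fine-grained tag
        "morph": str(token.morph),
        "dep": token.dep_,
        "head": token.head.text,  # syntactic head of this token
        "ent_type": token.ent_type_,
        "ent_iob": token.ent_iob_,
    }


def doc_annotations(doc):
    """Return one annotation dict per token, in document order."""
    return [token_annotations(token) for token in doc]
```

Given a `doc` produced by a loaded pipeline, `doc_annotations(doc)` would return one dictionary per token, which can be convenient for logging or for building a table.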

⌛️ Installation

As of v0.1.0, spacy-trankit is compatible only with spaCy v3.x. To install the most recent version:

pip install git+https://github.com/imvladikon/spacy-trankit
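
Since the package targets spaCy v3.x, it may help to check the installed spaCy version before loading a pipeline. The following is a minimal sketch; `is_supported_spacy` is a hypothetical helper, and the check assumes that compatibility depends only on the major version.

```python
def is_supported_spacy(version: str) -> bool:
    """Return True if a spaCy version string belongs to the v3.x series.

    Hypothetical helper: assumes only the major version number matters.
    """
    major = version.split(".", 1)[0]
    return major == "3"
```

In practice, one might call `is_supported_spacy(spacy.__version__)` after `import spacy`.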

📖 Usage & Examples

Load a pretrained Trankit model into a spaCy pipeline:

import spacy_trankit

# Initialize the pipeline
nlp = spacy_trankit.load("en")

doc = nlp("Barack Obama was born in Hawaii. He was elected president in 2008.")
for token in doc:
    print(token.text, token.lemma_, token.pos_, token.dep_, token.ent_type_)
print(doc.ents)

Load a model from a local path:

import spacy_trankit

# Initialize the pipeline
nlp = spacy_trankit.load_from_path(name="en", path="./cache") 

doc = nlp("Barack Obama was born in Hawaii. He was elected president in 2008.")
for token in doc:
    print(token.text, token.lemma_, token.pos_, token.dep_, token.ent_type_)
print(doc.ents)
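
Because the wrapper provides both sentence boundaries and named entities, the two can be combined. Below is a minimal sketch, assuming the returned `Doc` behaves like a regular spaCy `Doc` (so `doc.sents` yields spans that expose `.ents`). `entities_by_sentence` is a hypothetical helper, not part of the package.

```python
def entities_by_sentence(doc):
    """Pair each sentence text with the (text, label) of the entities inside it.

    Hypothetical helper; relies on spaCy's Doc.sents and Span.ents.
    """
    result = []
    for sent in doc.sents:
        entities = [(ent.text, ent.label_) for ent in sent.ents]
        result.append((sent.text, entities))
    return result
```

Calling `entities_by_sentence(doc)` on the `doc` from the examples above would return one `(sentence, entities)` pair per sentence; the labels themselves depend on the Trankit model in use.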
