spaCy wrapper for Trankit, a Transformer-based multilingual neural dependency parser with tokenization and NER
spaCy + Trankit
This package wraps the Trankit library so that you can use Trankit models in a spaCy pipeline.
Using this wrapper, you can access the following annotations, computed by your pretrained Trankit pipeline/model:
- Statistical tokenization (reflected in the Doc and its tokens)
- Lemmatization (token.lemma and token.lemma_)
- Part-of-speech tagging (token.tag, token.tag_, token.pos, token.pos_)
- Morphological analysis (token.morph)
- Dependency parsing (token.dep, token.dep_, token.head)
- Named entity recognition (doc.ents, token.ent_type, token.ent_type_, token.ent_iob, token.ent_iob_)
- Sentence segmentation (doc.sents)
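To make the link between the token-level entity attributes and doc.ents concrete, here is a minimal sketch that regroups token.ent_iob_ and token.ent_type_ values into (text, label) pairs. It follows spaCy's IOB convention ("B" begins an entity, "I" continues it, "O" or an empty string is outside one). The helper name collect_entities, the SimpleNamespace stand-in tokens, and the labels on them are assumptions made for illustration so the snippet runs without downloading a model; they are not actual Trankit output.

```python
from types import SimpleNamespace


def collect_entities(tokens):
    """Group tokens into (text, label) pairs using their IOB tags.

    Hypothetical helper: works on any objects exposing .text,
    .ent_iob_ and .ent_type_, as spaCy tokens do.
    """
    entities = []
    current_words, current_label = [], None
    for token in tokens:
        iob = token.ent_iob_
        if iob == "I" and current_words:
            # Continue the entity that is currently open.
            current_words.append(token.text)
            continue
        # Any other tag closes the entity in progress, if there is one.
        if current_words:
            entities.append((" ".join(current_words), current_label))
            current_words, current_label = [], None
        if iob == "B":
            # Open a new entity; a stray "I" with nothing open is ignored.
            current_words, current_label = [token.text], token.ent_type_
    if current_words:
        entities.append((" ".join(current_words), current_label))
    return entities


def make_token(text, iob, label=""):
    # Stand-in for a spaCy token; the values passed in are made up.
    return SimpleNamespace(text=text, ent_iob_=iob, ent_type_=label)


tokens = [
    make_token("Barack", "B", "PERSON"),
    make_token("Obama", "I", "PERSON"),
    make_token("was", "O"),
    make_token("born", "O"),
    make_token("in", "O"),
    make_token("Hawaii", "B", "GPE"),
    make_token(".", "O"),
]
entities = collect_entities(tokens)
print(entities)
```

Joining words with a single space is a simplification. With a real pipeline, doc.ents already provides these spans with their original text, so a helper like this is only useful for understanding or debugging the token-level tags.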
⌛️ Installation
As of v0.1.0, spacy-trankit is only compatible with spaCy v3.x. To install the most recent version:
pip install git+https://github.com/imvladikon/spacy-trankit
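Because the wrapper targets spaCy v3.x only, it can help to confirm which spaCy version is installed before loading a pipeline. The following is a minimal sketch using only the standard library; the function names major_version and spacy_is_v3 are hypothetical and not part of spacy-trankit.

```python
from importlib import metadata


def major_version(version):
    """Return the leading integer of a version string such as '3.7.2'."""
    return int(version.split(".")[0])


def spacy_is_v3(installed=None):
    """Check whether spaCy is at major version 3.

    If no version string is passed, look up the installed spaCy
    distribution; return False when spaCy is not installed at all.
    """
    if installed is None:
        try:
            installed = metadata.version("spacy")
        except metadata.PackageNotFoundError:
            return False
    return major_version(installed) == 3


print("spaCy v3.x available:", spacy_is_v3())
```

Passing a version string explicitly, as in spacy_is_v3("2.3.5"), skips the lookup, which makes the check easy to exercise without changing the environment.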
📖 Usage & Examples
Load a pretrained Trankit model into a spaCy pipeline:
import spacy_trankit

# Initialize the pipeline
nlp = spacy_trankit.load("en")

doc = nlp("Barack Obama was born in Hawaii. He was elected president in 2008.")

# Print per-token annotations, then the named entities found in the text
for token in doc:
    print(token.text, token.lemma_, token.pos_, token.dep_, token.ent_type_)
print(doc.ents)
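If you want the annotations as plain data rather than printed output, one option is to collect them per sentence through doc.sents. The sketch below is an illustration under stated assumptions: annotations_by_sentence is a hypothetical helper, and the fake_doc object is a stand-in with invented values so the snippet runs without a model. With a real pipeline you would pass the doc returned by nlp(...).

```python
from types import SimpleNamespace


def annotations_by_sentence(doc):
    """Return one list of annotation dicts per sentence in doc.sents."""
    return [
        [
            {"text": t.text, "lemma": t.lemma_, "pos": t.pos_, "dep": t.dep_}
            for t in sent
        ]
        for sent in doc.sents
    ]


def tok(text, lemma, pos, dep):
    # Stand-in for a spaCy token; the annotation values are made up.
    return SimpleNamespace(text=text, lemma_=lemma, pos_=pos, dep_=dep)


# Stand-in for a parsed Doc with two sentences.
fake_doc = SimpleNamespace(
    sents=[
        [
            tok("He", "he", "PRON", "nsubj:pass"),
            tok("was", "be", "AUX", "aux:pass"),
            tok("elected", "elect", "VERB", "root"),
            tok(".", ".", "PUNCT", "punct"),
        ],
        [
            tok("It", "it", "PRON", "nsubj"),
            tok("worked", "work", "VERB", "root"),
            tok(".", ".", "PUNCT", "punct"),
        ],
    ]
)

rows = annotations_by_sentence(fake_doc)
for sentence in rows:
    print([row["lemma"] for row in sentence])
```

The helper only relies on attribute access, so it works the same way on real spaCy sentences, where iterating a sentence span yields its tokens.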
Load a model from a local path:
import spacy_trankit

# Initialize the pipeline from a local path
nlp = spacy_trankit.load_from_path(name="en", path="./cache")

doc = nlp("Barack Obama was born in Hawaii. He was elected president in 2008.")

# Print per-token annotations, then the named entities found in the text
for token in doc:
    print(token.text, token.lemma_, token.pos_, token.dep_, token.ent_type_)
print(doc.ents)