
Extension for nlp-pie package


Pie Extended


Extension for pie to include taggers with their models and pre/postprocessors.

Pie is a wonderful tool to train models, and most of the time it will be enough. What pie_extended adds is the tooling you need to share your models together with customized pre- and post-processing.

The current system makes it easier to add customized:

  • normalization of your text,
  • sentence tokenization,
  • word tokenization,
  • disambiguation,
  • output formatting
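To make these stages concrete, here is a minimal sketch of how such steps can chain together. It uses only the standard library and is not the pie_extended API: every function name below is hypothetical, and the disambiguation step is left out because it depends on the tagger's output.

```python
import re
from typing import Dict, List


def normalize(text: str) -> str:
    """Collapse runs of whitespace and trim the ends (hypothetical normalizer)."""
    return re.sub(r"\s+", " ", text).strip()


def split_sentences(text: str) -> List[str]:
    """Split after '.', '!' or '?' when followed by whitespace."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text) if s]


def split_words(sentence: str) -> List[str]:
    """Separate word characters from individual punctuation marks."""
    return re.findall(r"\w+|[^\w\s]", sentence)


def format_output(tokens: List[str]) -> List[Dict[str, str]]:
    """Shape tokens as dictionaries; a real tagger would add lemma, POS and morph."""
    return [{"form": tok, "treated": tok.lower()} for tok in tokens]


def run_pipeline(text: str) -> List[List[Dict[str, str]]]:
    """Apply normalization, sentence splitting, word splitting and formatting in order."""
    return [format_output(split_words(s)) for s in split_sentences(normalize(text))]


if __name__ == "__main__":
    for sentence in run_pipeline("Lorem  ipsum dolor sit amet.\nConsectetur adipiscing elit."):
        print(sentence)
```

Keeping each stage as a separate function is what makes it possible to swap one of them (for instance the word tokenizer) without touching the others.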

Install

To install, run pip install pie-extended. Then, look at all available models.

Run on terminal

On top of that, it provides a quick and easy way to use other people's models. For example, in a shell:

pie-extended download lasla
pie-extended install-addons lasla
pie-extended tag lasla your_file.txt

This will give you access to all you need!
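If you would rather drive these three commands from a script, one possible sketch is below. It assumes the pie-extended executable is on your PATH; the helper names build_commands and run_all are hypothetical, and only the three subcommands shown above are used. By default it only prints the commands instead of running them.

```python
import subprocess
from typing import List


def build_commands(model: str, path: str) -> List[List[str]]:
    """Return the download, install-addons and tag commands for one model and file."""
    return [
        ["pie-extended", "download", model],
        ["pie-extended", "install-addons", model],
        ["pie-extended", "tag", model, path],
    ]


def run_all(model: str, path: str, dry_run: bool = True) -> None:
    """Print each command, or execute it and stop at the first failure."""
    for argv in build_commands(model, path):
        if dry_run:
            print(" ".join(argv))
        else:
            # check=True raises CalledProcessError on a non-zero exit code
            subprocess.run(argv, check=True)


if __name__ == "__main__":
    run_all("lasla", "your_file.txt")
```

Call run_all("lasla", "your_file.txt", dry_run=False) to actually execute the commands.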

Python API

You can run the lemmatizer in your own scripts and retrieve token annotations as dictionaries:

from typing import List
from pie_extended.cli.sub import get_tagger, download
# Model-specific data iterator and postprocessor
from pie_extended.models.lasla import get_iterator_and_processor

# Set to True if you still need to download the model
do_download = False
if do_download:
    # Consume the iterable returned by download(); the loop body has nothing else to do
    for dl in download("lasla"):
        pass

# model_path lets you load another .tar file instead of the default model
model_name = "lasla"
tagger = get_tagger(model_name, batch_size=256, device="cpu", model_path=None)

sentences: List[str] = ["Lorem ipsum dolor sit amet, consectetur adipiscing elit. "]
for sentence_group in sentences:
    # Get the main objects from the model: data iterator + postprocessor
    iterator, processor = get_iterator_and_processor()
    print(tagger.tag_str(sentence_group, iterator=iterator, processor=processor))

This will result in:

[{'form': 'lorem', 'lemma': 'lor', 'POS': 'NOMcom', 'morph': 'Case=Acc|Numb=Sing', 'treated': 'lorem'},
 {'form': 'ipsum', 'lemma': 'ipse', 'POS': 'PROdem', 'morph': 'Case=Acc|Numb=Sing', 'treated': 'ipsum'},
 {'form': 'dolor', 'lemma': 'dolor', 'POS': 'NOMcom', 'morph': 'Case=Nom|Numb=Sing', 'treated': 'dolor'},
 {'form': 'sit', 'lemma': 'sum1', 'POS': 'VER', 'morph': 'Numb=Sing|Mood=Sub|Tense=Pres|Voice=Act|Person=3',
  'treated': 'sit'},
 {'form': 'amet', 'lemma': 'amo', 'POS': 'VER', 'morph': 'Numb=Sing|Mood=Sub|Tense=Pres|Voice=Act|Person=3',
  'treated': 'amet'}, {'form': ',', 'lemma': ',', 'pos': 'PUNC', 'morph': 'MORPH=empty', 'treated': ','},
 {'form': 'consectetur', 'lemma': 'consector2', 'POS': 'VER',
  'morph': 'Numb=Sing|Mood=Sub|Tense=Pres|Voice=Dep|Person=3', 'treated': 'consectetur'},
 {'form': 'adipiscing', 'lemma': 'adipiscor', 'POS': 'VER', 'morph': 'Tense=Pres|Voice=Dep', 'treated': 'adipiscing'},
 {'form': 'elit', 'lemma': 'elio', 'POS': 'VER', 'morph': 'Numb=Sing|Mood=Ind|Tense=Pres|Voice=Act|Person=3',
  'treated': 'elit'}, {'form': '.', 'lemma': '.', 'pos': 'PUNC', 'morph': 'MORPH=empty', 'treated': '.'}]
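Since the annotations come back as a list of dictionaries, they are easy to reshape. The sketch below turns such a list into tab-separated text. The helper annotations_to_tsv is hypothetical and not part of pie_extended; the two sample tokens are copied from the output above. Because that output uses both 'POS' and 'pos' as keys, the lookup is case-insensitive.

```python
from typing import Dict, List, Sequence


def annotations_to_tsv(
    tokens: List[Dict[str, str]],
    columns: Sequence[str] = ("form", "lemma", "POS", "morph"),
) -> str:
    """Render token dictionaries as TSV, one token per line, with a header row."""
    lines = ["\t".join(columns)]
    for token in tokens:
        # Lower-case the keys so that 'POS' and 'pos' land in the same column
        lowered = {key.lower(): value for key, value in token.items()}
        lines.append("\t".join(lowered.get(col.lower(), "") for col in columns))
    return "\n".join(lines) + "\n"


sample = [
    {"form": "lorem", "lemma": "lor", "POS": "NOMcom",
     "morph": "Case=Acc|Numb=Sing", "treated": "lorem"},
    {"form": ",", "lemma": ",", "pos": "PUNC",
     "morph": "MORPH=empty", "treated": ","},
]

if __name__ == "__main__":
    print(annotations_to_tsv(sample), end="")
```

A missing key yields an empty cell rather than an error, which keeps the columns aligned when a token lacks one of the annotations.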

Add a model

ToDo: Documentation

Warning

This is an extremely early build, subject to change here and there. But it is functional!
