Extension for nlp-pie package

License
- OSI Approved :: MIT License
Programming Language
- Python
Topic
- Text Processing :: Linguistic

Project description

Pie Extended

Warning: For the moment, this software is compatible only with Python versions up to 3.7.

Extension for pie to include taggers with their models and pre/postprocessors.

Pie is a wonderful tool for training models, and most of the time it will be enough. What pie_extended adds is the tooling needed to share your models together with customized pre- and post-processing.

The current system provides easier access to adding customized:

- normalization of your text,
- sentence tokenization,
- word tokenization,
- disambiguation,
- output formatting.
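
As a rough illustration of how these five stages can chain together, here is a minimal sketch in plain Python. Every function name below is a hypothetical stand-in and not part of the pie_extended API; the package itself wires these steps through its own iterator and processor objects.

```python
import re
from typing import Dict, List

# All names below are hypothetical stand-ins, not pie_extended functions.

def normalize(text: str) -> str:
    """Normalization: collapse whitespace and lowercase the text."""
    return re.sub(r"\s+", " ", text).strip().lower()

def split_sentences(text: str) -> List[str]:
    """Sentence tokenization: split on whitespace that follows '.', '!' or '?'."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text) if s]

def split_words(sentence: str) -> List[str]:
    """Word tokenization: separate words from punctuation marks."""
    return re.findall(r"\w+|[^\w\s]", sentence)

def disambiguate(lemma: str) -> str:
    """Disambiguation placeholder: a real step would choose among homographs."""
    return lemma

def format_token(form: str) -> Dict[str, str]:
    """Output formatting: one dictionary per token."""
    return {"form": form, "lemma": disambiguate(form)}

def run_pipeline(text: str) -> List[List[Dict[str, str]]]:
    """Apply the stages in order and return one list of tokens per sentence."""
    return [
        [format_token(word) for word in split_words(sentence)]
        for sentence in split_sentences(normalize(text))
    ]

result = run_pipeline("Lorem  ipsum dolor. Sit amet!")
```

Keeping each stage as a separate function mirrors the idea that any one of them can be swapped for a language-specific version without touching the others.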

Cite as

@software{thibault_clerice_2020_3883590,
  author       = {Clérice, Thibault},
  title        = {Pie Extended, an extension for Pie with pre-processing and post-processing},
  month        = jun,
  year         = 2020,
  publisher    = {Zenodo},
  doi          = {10.5281/zenodo.3883589},
  url          = {https://doi.org/10.5281/zenodo.3883589}
}

Current supported languages

Classical Latin (Model: lasla)
Ancient Greek (Model: grc)
Old French (Model: fro)
Middle French (Model: frm)
Early Modern French (Model: freem)
Classical French (Model: fr)
Old Dutch (Model: dum)
Old Occitan (Model: old_occ)

If you trained models and want some help sharing them with Pie Extended, open an issue :)

Install

To install, run pip install pie-extended. Then pick one of the available models (see the supported languages above).

WARNING: if you don't have a GPU or CUDA

If in doubt, install against the CPU-only PyTorch package index: pip install pie-extended --extra-index-url https://download.pytorch.org/whl/cpu

Run on terminal

On top of that, pie_extended provides a quick and easy way to use other people's models. For example, in a shell:

pie-extended download lasla
pie-extended install-addons lasla
pie-extended tag lasla your_file.txt

will give you access to everything you need.

Python API

You can run the lemmatizer in your own scripts and retrieve token annotations as dictionaries:

from typing import List
from pie_extended.cli.utils import get_tagger, get_model, download
# Each model package exposes its own data iterator and postprocessor
from pie_extended.models.lasla.imports import get_iterator_and_processor

# Set to True if the model has not been downloaded yet
do_download = False
if do_download:
    # Iterate over download() so that every file gets fetched
    for _ in download("lasla"):
        pass

# model_path lets you load another .tar file instead of the default model
model_name = "lasla"
tagger = get_tagger(model_name, batch_size=256, device="cpu", model_path=None)

sentences: List[str] = ["Lorem ipsum dolor sit amet, consectetur adipiscing elit. "]
for sentence_group in sentences:
    # Get the main objects from the model: data iterator + postprocessor
    iterator, processor = get_iterator_and_processor()
    print(tagger.tag_str(sentence_group, iterator=iterator, processor=processor))

will result in

[{'form': 'lorem', 'lemma': 'lor', 'POS': 'NOMcom', 'morph': 'Case=Acc|Numb=Sing', 'treated': 'lorem'},
 {'form': 'ipsum', 'lemma': 'ipse', 'POS': 'PROdem', 'morph': 'Case=Acc|Numb=Sing', 'treated': 'ipsum'},
 {'form': 'dolor', 'lemma': 'dolor', 'POS': 'NOMcom', 'morph': 'Case=Nom|Numb=Sing', 'treated': 'dolor'},
 {'form': 'sit', 'lemma': 'sum1', 'POS': 'VER', 'morph': 'Numb=Sing|Mood=Sub|Tense=Pres|Voice=Act|Person=3',
  'treated': 'sit'},
 {'form': 'amet', 'lemma': 'amo', 'POS': 'VER', 'morph': 'Numb=Sing|Mood=Sub|Tense=Pres|Voice=Act|Person=3',
  'treated': 'amet'}, {'form': ',', 'lemma': ',', 'pos': 'PUNC', 'morph': 'MORPH=empty', 'treated': ','},
 {'form': 'consectetur', 'lemma': 'consector2', 'POS': 'VER',
  'morph': 'Numb=Sing|Mood=Sub|Tense=Pres|Voice=Dep|Person=3', 'treated': 'consectetur'},
 {'form': 'adipiscing', 'lemma': 'adipiscor', 'POS': 'VER', 'morph': 'Tense=Pres|Voice=Dep', 'treated': 'adipiscing'},
 {'form': 'elit', 'lemma': 'elio', 'POS': 'VER', 'morph': 'Numb=Sing|Mood=Ind|Tense=Pres|Voice=Act|Person=3',
  'treated': 'elit'}, {'form': '.', 'lemma': '.', 'pos': 'PUNC', 'morph': 'MORPH=empty', 'treated': '.'}]
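
Since the result is a plain list of dictionaries, it can be post-processed with ordinary Python. The sketch below works on a shortened copy of the sample output and does not call pie_extended; the helper names (get_pos, parse_morph, to_tsv) are illustrative assumptions. Note that the sample uses the key 'POS' for words and 'pos' for punctuation, so the lookup accepts both.

```python
from collections import Counter
from typing import Dict, List

# A shortened copy of the sample output shown above.
annotations: List[Dict[str, str]] = [
    {"form": "lorem", "lemma": "lor", "POS": "NOMcom",
     "morph": "Case=Acc|Numb=Sing", "treated": "lorem"},
    {"form": "sit", "lemma": "sum1", "POS": "VER",
     "morph": "Numb=Sing|Mood=Sub|Tense=Pres|Voice=Act|Person=3", "treated": "sit"},
    {"form": ",", "lemma": ",", "pos": "PUNC",
     "morph": "MORPH=empty", "treated": ","},
]

def get_pos(token: Dict[str, str]) -> str:
    """Return the part of speech whether the key is 'POS' or 'pos'."""
    return token.get("POS", token.get("pos", ""))

def parse_morph(morph: str) -> Dict[str, str]:
    """Split 'Case=Acc|Numb=Sing' into {'Case': 'Acc', 'Numb': 'Sing'}."""
    return dict(pair.split("=", 1) for pair in morph.split("|") if "=" in pair)

def to_tsv(tokens: List[Dict[str, str]]) -> str:
    """Render tokens as tab-separated lines with a header row."""
    header = "form\tlemma\tPOS\tmorph"
    rows = ["\t".join([t["form"], t["lemma"], get_pos(t), t["morph"]]) for t in tokens]
    return "\n".join([header] + rows)

# Count how many tokens carry each part-of-speech tag.
pos_counts = Counter(get_pos(t) for t in annotations)
print(to_tsv(annotations))
```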

Add a model

Create a package in ./pie_extended/models/. Example: foo.
Add the name of the package to the variable modules in ./pie_extended/models/__init__.py.
The module pie_extended.models.foo should define the following variables:
- Models: a string with filenames and tasks for Pie.
- DESC: a Metadata object that holds information about the model.
- DOWNLOADS: a list of files to download.

from pie_extended.utils import Metadata, File, get_path

DESC = Metadata(
    "Foo",
    "language",
    ["Author 1", "Author 2"],
    "A readable description",
    "A link to more information"
)

DOWNLOADS = [
    File("/a/link/to/a/file", "local_name_of_the_file.tar")
]


# {0} is used twice so that both task groups load the same downloaded archive
Models = "<{0},task1,task2><{0},lemma,pos>".format(
    get_path("foo", "local_name_of_the_file.tar")
)
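
To make the layout of the Models string easier to see, here is a small sketch that splits such a string into (file, tasks) pairs. It assumes each group has the form <file,task,...>, as in the example above; parse_models is a hypothetical helper and not part of pie_extended or Pie, and it would not handle file paths that contain commas.

```python
import re
from typing import List, Tuple

def parse_models(spec: str) -> List[Tuple[str, List[str]]]:
    """Split '<file,task1,task2><file,lemma>' into (file, [tasks]) pairs."""
    pairs: List[Tuple[str, List[str]]] = []
    for group in re.findall(r"<([^<>]+)>", spec):
        # The first comma-separated field is the file, the rest are task names.
        path, *tasks = group.split(",")
        pairs.append((path, tasks))
    return pairs

# Placeholder file names, used only for this illustration.
spec = "<{},task1,task2><{},lemma,pos>".format("a.tar", "b.tar")
for path, tasks in parse_models(spec):
    print(path, tasks)
```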

In the module pie_extended.models.foo.imports, we should find the following content:
1. get_iterator_and_processor: a function that returns a DataIterator and a Processor
2. (optionally) addons: a function that installs add-ons
3. (optionally) Disambiguator: a disambiguator instance (or an object creator that returns one)

For a simple example, see pie_extended.models.fro.imports; for a more complex one, see pie_extended.models.lasla.imports.

Install development version (⚠ for development only)

Clone the repository, create an environment, and then

python setup.py develop

Warning

This is an extremely early build and is subject to change, but it is functional.

Release history

This version

0.1.5

Jun 23, 2026

0.1.4

Jun 22, 2026

0.1.3

May 30, 2024

0.1.2

Apr 28, 2023

0.1.1

Apr 27, 2023

0.1.0

Apr 27, 2023

0.0.42

Feb 23, 2023

0.0.41

Dec 5, 2022

0.0.40

May 10, 2022

0.0.39

Jun 4, 2021

0.0.38

May 20, 2021

0.0.37

Apr 13, 2021

0.0.36

Apr 13, 2021

0.0.35

Apr 12, 2021

0.0.34

Apr 3, 2021

0.0.33

Apr 3, 2021

0.0.32

Mar 22, 2021

0.0.31

Feb 17, 2021

0.0.30

Feb 17, 2021

0.0.29

Feb 11, 2021

0.0.28

Feb 11, 2021

0.0.27

Feb 3, 2021

0.0.26

Jan 14, 2021

0.0.25

Jan 13, 2021

0.0.24

Dec 14, 2020

0.0.23

Dec 4, 2020

0.0.22

Dec 2, 2020

0.0.21

Sep 24, 2020

0.0.20

Sep 22, 2020

0.0.19

Sep 18, 2020

0.0.18

Sep 8, 2020

0.0.17

Jul 26, 2020

0.0.16

Jun 22, 2020

0.0.15

Jun 22, 2020

0.0.13

May 6, 2020

0.0.12

May 5, 2020

0.0.11

Apr 28, 2020

0.0.10

Apr 24, 2020

0.0.9

Apr 24, 2020

0.0.8

Mar 4, 2020

0.0.7

Mar 4, 2020

0.0.6

Feb 27, 2020

0.0.5

Feb 25, 2020

0.0.4

Feb 25, 2020

0.0.3

Feb 10, 2020

0.0.2

Dec 21, 2019

0.0.1

Dec 20, 2019

Download files

Source Distribution

pie_extended-0.1.5.tar.gz (47.9 kB)

Uploaded Jun 23, 2026 Source

Built Distribution

pie_extended-0.1.5-py3-none-any.whl (66.5 kB)

Uploaded Jun 23, 2026 Python 3

File details

Details for the file pie_extended-0.1.5.tar.gz.

File metadata

Download URL: pie_extended-0.1.5.tar.gz
Upload date: Jun 23, 2026
Size: 47.9 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.10.20

File hashes

Hashes for pie_extended-0.1.5.tar.gz
Algorithm	Hash digest
SHA256	`c1962e95ee892048f32fd6a9b868451215d0db9668ba23c750dc556bf74f61f7`
MD5	`d28d86557209182805d0318d4561a1d1`
BLAKE2b-256	`5e796ded46c2c0c18eeca76e5360d7dc27aae37b75d6958134effc8118894962`
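
A downloaded file can be checked against the published digests. The following is a minimal sketch using only the Python standard library; the function names are illustrative, and where the downloaded archive sits on disk is left to the reader.

```python
import hashlib
from pathlib import Path

# SHA256 digest published above for pie_extended-0.1.5.tar.gz
EXPECTED_SDIST_SHA256 = "c1962e95ee892048f32fd6a9b868451215d0db9668ba23c750dc556bf74f61f7"

def sha256_of(path: Path, chunk_size: int = 65536) -> str:
    """Compute the SHA256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def matches_published_hash(path: Path, expected: str) -> bool:
    """Compare a file's digest with a published one, ignoring case and spaces."""
    return sha256_of(path) == expected.strip().lower()
```

Calling matches_published_hash(Path("pie_extended-0.1.5.tar.gz"), EXPECTED_SDIST_SHA256) on the downloaded archive should return True if the file is intact.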

File details

Details for the file pie_extended-0.1.5-py3-none-any.whl.

File metadata

Download URL: pie_extended-0.1.5-py3-none-any.whl
Upload date: Jun 23, 2026
Size: 66.5 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.10.20

File hashes

Hashes for pie_extended-0.1.5-py3-none-any.whl
Algorithm	Hash digest
SHA256	`20e4e5d8d2bbee453112704ae945eebf03caba5c54bd14d5e10dbe027a402785`
MD5	`37facb254207d89c39d46450022bc3e0`
BLAKE2b-256	`785a091a09f7f2d44c043204f65fb6375a1e0e9fd313cad12e670c1da40210b9`
