Python bindings to the LIMA linguistic analyzer
Reason this release was yanked:
Don't need different files for cp*. Superseded by 0.4.1.
Project description
LIMA python bindings
Introducing LIMA
LIMA is a multilingual linguistic analyzer developed by the CEA LIST, LASTI laboratory (French acronym for Text and Image Semantic Analysis Laboratory). LIMA is Free Software, available under the MIT license.
LIMA achieves state-of-the-art performance for more than 60 languages thanks to its recent deep-learning (neural network) modules. It also includes a powerful rule-based mechanism called ModEx, which makes it possible to quickly extract information (entities, relations, events…) in new domains where no annotated data exists.
For more information, detailed installation instructions and documentation, please refer to the LIMA Wiki.
Installation
LIMA Python bindings are currently available for Python 3.8 only. Install with:
$ pip install --upgrade pip # IMPORTANT: LIMA needs a recent pip
$ pip install aymara
You can use it as is for English (eng) or French (fre), but the deep-learning-based models are preferable. To install them, use the lima_models.py script:
$ lima_models.py -h
usage: lima_models.py [-h] [-i] [-l LANG] [-d DEST] [-s SELECT] [-f] [-L]

optional arguments:
  -h, --help            show this help message and exit
  -i, --info            print list of available languages and exit
  -l LANG, --lang LANG  install model for the given language name or language code
                        (example: 'english' or 'eng')
  -d DEST, --dest DEST  destination directory
  -s SELECT, --select SELECT
                        select particular models to install: tokenizer, morphosyntax, lemmatizer
                        (comma-separated list)
  -f, --force           force reinstallation of existing files
  -L, --list            list installed models
For example:
$ lima_models.py -l eng
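When several languages are needed, the same invocation can be scripted. The following is a minimal sketch, not part of LIMA: the helper name build_install_command is hypothetical, and it only assembles command lines from the flags documented in the help output above (-l, -d, -s, -f). It assumes lima_models.py is on the PATH.

```python
import shlex


def build_install_command(lang, dest=None, select=None, force=False):
    """Build the argument list for one lima_models.py call (hypothetical helper).

    Only flags shown in the help output are used: -l, -d, -s and -f.
    """
    cmd = ["lima_models.py", "-l", lang]
    if dest is not None:
        cmd += ["-d", str(dest)]
    if select:
        # -s expects a comma-separated list of model kinds
        cmd += ["-s", ",".join(select)]
    if force:
        cmd.append("-f")
    return cmd


if __name__ == "__main__":
    # Print the commands instead of running them; to execute one, pass the
    # list to subprocess.run(cmd, check=True).
    for lang in ("eng", "fre"):
        print(shlex.join(build_install_command(lang)))
```

Building the command as a list rather than a string avoids shell quoting issues if the destination directory contains spaces.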
Running
$ python
Python 3.8.10 (default, Nov 26 2021, 20:14:08)
[GCC 9.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import aymara.lima
>>> l = aymara.lima.Lima("ud-eng")
>>> r = l.analyzeText("The author wrote a novel.", lang="ud-eng")
>>> print(r.conll())
# sent_id = 1
# text = The author wrote a novel.
1 The the DET _ Definite=Def|PronType=Art 2 det _ Len=3|Pos=1
2 author author NOUN _ Number=Sing 3 nsubj _ Len=6|Pos=5
3 wrote write VERB _ Mood=Ind|Tense=Past|VerbForm=Fin 0 root _ Len=5|Pos=12
4 a a DET _ Definite=Ind|PronType=Art 5 det _ Len=1|Pos=18
5 novel novel NOUN _ Number=Sing 3 obj _ Len=5|Pos=20|SpaceAfter=No
6 . . PUNCT _ _ 3 punct _ Len=1|Pos=25
>>>
Note that some error messages may be displayed when the Lima object is instantiated. If you get a valid object, you can ignore them. Most of them are debug messages that will be removed in a later version.
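To work with the analysis programmatically rather than printing it, the CoNLL-U text can be split into one dictionary per token. This is a minimal sketch under an assumption: that r.conll() returns a standard CoNLL-U string with ten tab-separated columns (the listing above shows them with the tabs collapsed to spaces). The function parse_conllu and the SAMPLE string are illustrative and not part of the aymara package.

```python
# The ten standard CoNLL-U column names, in order.
CONLLU_FIELDS = ("id", "form", "lemma", "upos", "xpos",
                 "feats", "head", "deprel", "deps", "misc")


def parse_conllu(text):
    """Return a list of sentences, each a list of token dicts."""
    sentences, current = [], []
    for line in text.splitlines():
        if not line.strip():
            # A blank line closes the current sentence.
            if current:
                sentences.append(current)
                current = []
            continue
        if line.startswith("#"):
            continue  # skip comment lines such as "# sent_id = 1"
        columns = line.split("\t")
        if len(columns) != len(CONLLU_FIELDS):
            raise ValueError("expected 10 tab-separated columns: %r" % line)
        current.append(dict(zip(CONLLU_FIELDS, columns)))
    if current:
        sentences.append(current)
    return sentences


# First three tokens of the example above, with tabs restored (assumption).
SAMPLE = (
    "# sent_id = 1\n"
    "# text = The author wrote a novel.\n"
    "1\tThe\tthe\tDET\t_\tDefinite=Def|PronType=Art\t2\tdet\t_\tLen=3|Pos=1\n"
    "2\tauthor\tauthor\tNOUN\t_\tNumber=Sing\t3\tnsubj\t_\tLen=6|Pos=5\n"
    "3\twrote\twrite\tVERB\t_\tMood=Ind|Tense=Past|VerbForm=Fin\t0\troot\t_\tLen=5|Pos=12\n"
)

if __name__ == "__main__":
    for token in parse_conllu(SAMPLE)[0]:
        print(token["form"], token["lemma"], token["upos"])
```

In a live session, parse_conllu(r.conll()) would be the equivalent call, provided the output is tab-separated as assumed.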
You can replace the language used (ud-eng) with eng to use the legacy pipeline. The same applies to ud-fra and fre. Note that legacy pipelines do not use the Universal Dependencies tagset, but a proprietary one.
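The choice between the two kinds of pipeline can be wrapped in a small helper. This is a sketch only: pipeline_name and analyze are hypothetical names, the mapping covers just the two language pairs mentioned in the text, and the LIMA calls (Lima(...), analyzeText(..., lang=...), conll()) are copied from the session shown earlier rather than from an API reference.

```python
# Legacy language code -> Universal Dependencies pipeline name,
# limited to the two pairs named in the text.
UD_PIPELINES = {"eng": "ud-eng", "fre": "ud-fra"}


def pipeline_name(lang, legacy=False):
    """Return the pipeline identifier for a legacy language code."""
    if lang not in UD_PIPELINES:
        raise ValueError("unsupported language code in this sketch: %r" % lang)
    return lang if legacy else UD_PIPELINES[lang]


def analyze(text, lang, legacy=False):
    """Analyze text and return CoNLL output (requires aymara to be installed)."""
    # Imported lazily so that pipeline_name() stays usable without LIMA.
    import aymara.lima

    name = pipeline_name(lang, legacy)
    analyzer = aymara.lima.Lima(name)
    return analyzer.analyzeText(text, lang=name).conll()
```

For example, analyze("The author wrote a novel.", "eng") would select ud-eng, while passing legacy=True would select eng and produce tags from the proprietary tagset.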
Configuration and customization
To fine-tune LIMA for your needs, follow the same instructions as for the native C++ tools, available in the LIMA User Manual: https://github.com/aymara/lima/wiki/LIMA-User-Manual
LIMA poetry package build instructions
Build, install and deploy this PyPI package using poetry:
$ pip install poetry
$ poetry build
$ poetry install
$ poetry publish
More information: https://python-poetry.org/
PySide2 LIMA python bindings build instructions (in progress)
First install PySide2:
# Install PySide2 and shiboken2 from source as binary installs are broken
# Done in /home/gael/Logiciels/
sudo apt install qtbase5-private-dev qtdeclarative5-private-dev
git clone https://code.qt.io/cgit/pyside/pyside-setup.git
cd pyside-setup
python setup.py install --cmake=/usr/bin/cmake --build-type=all
# The build fails with an rcc execution error; copy rcc into the build tree and run it again
cp /usr/bin/rcc /home/gael/Logiciels/pyside-setup/lima3_install/py3.8-qt5.15.3-64bit-release/bin/rcc
python setup.py install --cmake=/usr/bin/cmake --build-type=all
Building and deploying the wheel
docker build . -t lima-python:latest
docker create -ti --name dummy lima-python:latest bash
docker cp dummy:/lima-python/wheelhouse/aymara-0.3.4-cp38-cp38-manylinux_2_24_x86_64.whl .
docker rm -f dummy
scp aymara-0.3.4-cp38-cp38-manylinux_2_24_x86_64.whl gdechalendar@combava:/data/HTTP_FileServer/data/lima
Built Distributions
Hashes for aymara-0.4.0-cp39-cp39-manylinux_2_24_x86_64.whl
Algorithm | Hash digest
---|---
SHA256 | d63ebd3a94040182c91485329dfe9b99950723dfabf12401798e1b7b3f895030
MD5 | 339af50bb0ca21844bc82b1b8181c4ac
BLAKE2b-256 | 66bbf4c654d40972947e5505b97f7bde1d882fb18fd44b27a7c65531d12070e9
Hashes for aymara-0.4.0-cp38-cp38-manylinux_2_24_x86_64.whl
Algorithm | Hash digest
---|---
SHA256 | 955eccb5ff551ba47c9dfd5f0df3e7a2f3ad403a31b020ee8c5835d9db9bfe8e
MD5 | 944eed71a4865ef71a54b36143c43976
BLAKE2b-256 | 597aa47a302244a607ee7980fe8c1eec3de4afd3356781229da6c2a848bf5625
Hashes for aymara-0.4.0-cp37-cp37m-manylinux_2_24_x86_64.whl
Algorithm | Hash digest
---|---
SHA256 | 9c996923def343971f578e46172171a2a6bdd583b231e5078151b0361af9d661
MD5 | ff34b432afc49484c2935b739b745e9e
BLAKE2b-256 | 1cb4e2712cb724a0e2716decc77d40d6868e4b589cf966b2ac198fc6b845a0f1
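A downloaded wheel can be checked against the SHA256 digests listed above before installing it. The following is a minimal sketch using only the Python standard library; the function names sha256_of and verify_sha256 are illustrative, and the wheel is assumed to be in the current directory.

```python
import hashlib


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read until an empty bytes object signals end of file.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_sha256(path, expected_hex):
    """Return True if the file's SHA256 digest matches the published one."""
    return sha256_of(path) == expected_hex.strip().lower()


if __name__ == "__main__":
    # File name and digest taken from the cp38 entry above.
    wheel = "aymara-0.4.0-cp38-cp38-manylinux_2_24_x86_64.whl"
    published = "955eccb5ff551ba47c9dfd5f0df3e7a2f3ad403a31b020ee8c5835d9db9bfe8e"
    print("OK" if verify_sha256(wheel, published) else "MISMATCH")
```

Reading in chunks keeps memory use constant regardless of the wheel size.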