Multilingual POS-tagger and Dependency-parser
MultiCOMBO
Multilingual POS-Tagger and Dependency-Parser with COMBO-pytorch and spaCy
Basic usage
>>> import multicombo
>>> nlp=multicombo.load()
>>> doc=nlp('Who plays "La vie en rose"?')
>>> print(multicombo.to_conllu(doc))
# text = Who plays "La vie en rose"?
1 Who _ PRON _ PronType=Int 2 nsubj _ Translit=who
2 plays _ VERB _ Mood=Ind|Number=Sing|Person=3|Tense=Pres|VerbForm=Fin 0 root _ _
3 " _ PUNCT _ _ 5 punct _ SpaceAfter=No
4 La _ DET _ Definite=Def|Gender=Fem|Number=Sing|PronType=Art 5 det _ Translit=la
5 vie _ NOUN _ Gender=Fem|Number=Sing 2 obj _ _
6 en _ ADP _ _ 7 case _ _
7 rose _ NOUN _ Number=Sing 5 nmod _ SpaceAfter=No
8 " _ PUNCT _ _ 5 punct _ SpaceAfter=No
9 ? _ PUNCT _ _ 2 punct _ SpaceAfter=No
>>> import deplacy
>>> deplacy.render(doc)
Who PRON <════════════╗ nsubj
plays VERB ═══════════╗═╝═╗ ROOT
" PUNCT <══════╗ ║ ║ punct
La DET <════╗ ║ ║ ║ det
vie NOUN ═══╗═╝═╝═╗<╝ ║ obj
en ADP <╗ ║ ║ ║ case
rose NOUN ═╝<╝ ║ ║ nmod
" PUNCT <════════╝ ║ punct
? PUNCT <══════════════╝ punct
>>> deplacy.serve(doc)
http://127.0.0.1:5000
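The string returned by multicombo.to_conllu(doc) is in CoNLL-U format: comment lines start with #, and each token line carries ten fields (ID, FORM, LEMMA, UPOS, XPOS, FEATS, HEAD, DEPREL, DEPS, MISC). The following is a minimal sketch of reading such a string into dictionaries. It assumes the fields are tab-separated, as standard CoNLL-U prescribes (the listing above shows them with spaces). parse_conllu and the hand-built sample are hypothetical illustrations, not part of multicombo.

```python
# Field names follow the CoNLL-U column order.
FIELDS = ("id", "form", "lemma", "upos", "xpos",
          "feats", "head", "deprel", "deps", "misc")

def parse_conllu(text):
    """Return a list of token dicts for one CoNLL-U sentence block.

    Hypothetical helper: assumes tab-separated columns and skips
    comment lines, blank lines, and lines without exactly ten fields.
    """
    tokens = []
    for line in text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue
        cols = line.split("\t")
        if len(cols) != len(FIELDS):
            continue
        tokens.append(dict(zip(FIELDS, cols)))
    return tokens

# Hand-built sample using the first two token rows from the output above.
sample = "\n".join([
    "# text = Who plays",
    "\t".join(["1", "Who", "_", "PRON", "_", "PronType=Int",
               "2", "nsubj", "_", "Translit=who"]),
    "\t".join(["2", "plays", "_", "VERB", "_",
               "Mood=Ind|Number=Sing|Person=3|Tense=Pres|VerbForm=Fin",
               "0", "root", "_", "_"]),
])

tokens = parse_conllu(sample)
# The root token is the one whose HEAD field is "0".
roots = [t["form"] for t in tokens if t["head"] == "0"]
print(roots)
```

With a real pipeline, the same function could be applied to multicombo.to_conllu(doc) in place of the sample string, provided the output is tab-separated as assumed.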
multicombo.load(lang="xx") loads a spaCy Language pipeline with bert-base-multilingual-cased and the spacy.lang.xx.MultiLanguage tokenizer. Other language-specific tokenizers can be selected with the lang option; several languages require additional packages:
- lang="ja": Japanese requires SudachiPy and SudachiDict-core.
- lang="th": Thai requires PyThaiNLP.
- lang="vi": Vietnamese requires pyvi.
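Before calling multicombo.load with one of these languages, it can help to check whether the extra packages are importable. The sketch below does this with the standard library only. The import names in EXTRA_MODULES (sudachipy, sudachidict_core, pythainlp, pyvi) are assumptions about how the packages named above are imported, and missing_extras is a hypothetical helper, not part of multicombo.

```python
import importlib.util

# Assumed import names for the additional packages listed above.
# This mapping is illustrative and is not taken from multicombo itself.
EXTRA_MODULES = {
    "ja": ["sudachipy", "sudachidict_core"],
    "th": ["pythainlp"],
    "vi": ["pyvi"],
}

def missing_extras(lang):
    """Return the assumed extra modules for `lang` that cannot be found.

    Languages without an entry in EXTRA_MODULES yield an empty list.
    """
    return [name for name in EXTRA_MODULES.get(lang, [])
            if importlib.util.find_spec(name) is None]

for lang in EXTRA_MODULES:
    missing = missing_extras(lang)
    status = "ok" if not missing else "missing: " + ", ".join(missing)
    print(lang, status)

# If nothing is missing, the pipeline could then be loaded as described above:
#   import multicombo
#   nlp = multicombo.load(lang="ja")
```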
Installation for Linux
pip3 install allennlp@git+https://github.com/allenai/allennlp
pip3 install 'transformers<4.31'
pip3 install multicombo
Installation for Cygwin64
Make sure the python37-devel python37-pip python37-cython python37-numpy python37-cffi gcc-g++ mingw64-x86_64-gcc-g++ gcc-fortran git curl make cmake libopenblas liblapack-devel libhdf5-devel libfreetype-devel libuv-devel packages are installed, and then run:
curl -L https://raw.githubusercontent.com/KoichiYasuoka/UniDic-COMBO/master/cygwin64.sh | sh
pip3.7 install multicombo
Installation for Jupyter Notebook (Google Colaboratory)
Try the notebook.
File details
Details for the file multicombo-0.8.7-py3-none-any.whl.
File metadata
- Download URL: multicombo-0.8.7-py3-none-any.whl
- Upload date:
- Size: 16.8 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.2 CPython/3.9.2
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 642804c3543b6aa76576ec56f2d2babf1aa7b5eff7a75e35ede6c926f6b2b10b |
| MD5 | 97e893886d901119126adb539d180417 |
| BLAKE2b-256 | 1b8182c4e8c4118fa421f0e8260b23bdf9a3d3430e1597b3413749bea2297e35 |