MultiCOMBO
Multilingual POS-Tagger and Dependency-Parser with COMBO-pytorch and spaCy
Basic usage
>>> import multicombo
>>> nlp=multicombo.load()
>>> doc=nlp('Who plays "La vie en rose"?')
>>> print(multicombo.to_conllu(doc))
# text = Who plays "La vie en rose"?
1 Who _ PRON _ PronType=Int 2 nsubj _ Translit=who
2 plays _ VERB _ Mood=Ind|Number=Sing|Person=3|Tense=Pres|VerbForm=Fin 0 root _ _
3 " _ PUNCT _ _ 5 punct _ SpaceAfter=No
4 La _ DET _ Definite=Def|Gender=Fem|Number=Sing|PronType=Art 5 det _ Translit=la
5 vie _ NOUN _ Gender=Fem|Number=Sing 2 obj _ _
6 en _ ADP _ _ 7 case _ _
7 rose _ NOUN _ Number=Sing 5 nmod _ SpaceAfter=No
8 " _ PUNCT _ _ 5 punct _ SpaceAfter=No
9 ? _ PUNCT _ _ 2 punct _ SpaceAfter=No
>>> import deplacy
>>> deplacy.render(doc)
Who   PRON  <════════════╗   nsubj
plays VERB  ═══════════╗═╝═╗ ROOT
"     PUNCT <══════╗   ║   ║ punct
La    DET   <════╗ ║   ║   ║ det
vie   NOUN  ═══╗═╝═╝═╗<╝   ║ obj
en    ADP   <╗ ║     ║     ║ case
rose  NOUN  ═╝<╝     ║     ║ nmod
"     PUNCT <════════╝     ║ punct
?     PUNCT <══════════════╝ punct
>>> deplacy.serve(doc)
http://127.0.0.1:5000
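The CoNLL-U text printed in the session above can be post-processed with plain Python. The following is a minimal sketch, assuming the string follows the standard CoNLL-U layout of ten tab-separated columns per token line; `parse_conllu`, `root_form`, and the abbreviated `sample` string are illustrative names and data, not part of multicombo.

```python
# Sketch: split CoNLL-U text into per-token dictionaries.
# Assumption: ten tab-separated columns per token line, as in standard CoNLL-U.
FIELDS = ("id", "form", "lemma", "upos", "xpos",
          "feats", "head", "deprel", "deps", "misc")


def parse_conllu(text):
    """Return one dict per token line; comment and blank lines are skipped."""
    tokens = []
    for line in text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue
        cols = line.split("\t")
        if len(cols) != len(FIELDS):
            raise ValueError(
                f"expected {len(FIELDS)} columns, got {len(cols)}: {line!r}")
        tokens.append(dict(zip(FIELDS, cols)))
    return tokens


def root_form(tokens):
    """Return the form of the token whose head is 0 (the root), or None."""
    for tok in tokens:
        if tok["head"] == "0":
            return tok["form"]
    return None


# Hypothetical, shortened sample modelled on the first two tokens above.
sample = (
    "# text = Who plays\n"
    "1\tWho\t_\tPRON\t_\tPronType=Int\t2\tnsubj\t_\tTranslit=who\n"
    "2\tplays\t_\tVERB\t_\tVerbForm=Fin\t0\troot\t_\t_\n"
)
tokens = parse_conllu(sample)
print([(t["form"], t["upos"], t["deprel"]) for t in tokens])
print(root_form(tokens))
```

In a real session the `sample` string would be replaced by the value returned from `multicombo.to_conllu(doc)`.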
multicombo.load(lang="xx") loads a spaCy Language pipeline with bert-base-multilingual-cased and the spacy.lang.xx.MultiLanguage tokenizer. Other language-specific tokenizers can be selected with the lang option, and several languages require additional packages:
- lang="ja": Japanese requires SudachiPy and SudachiDict-core.
- lang="th": Thai requires PyThaiNLP.
- lang="vi": Vietnamese requires pyvi.
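Because the extra packages listed above are only needed for some values of lang, it can help to check for them before loading a pipeline. The sketch below is one way to do that; `missing_extras` and `load_pipeline` are hypothetical helpers, and the import names (`sudachipy`, `sudachidict_core`, `pythainlp`, `pyvi`) are assumed to correspond to the packages named above.

```python
import importlib.util

# Import names for the extra packages, keyed by the lang option.
# Assumption: these are the module names installed by the packages above.
EXTRA_MODULES = {
    "ja": ("sudachipy", "sudachidict_core"),
    "th": ("pythainlp",),
    "vi": ("pyvi",),
}


def missing_extras(lang):
    """Return the import names needed for `lang` that are not installed."""
    return [name for name in EXTRA_MODULES.get(lang, ())
            if importlib.util.find_spec(name) is None]


def load_pipeline(lang="xx"):
    """Hypothetical wrapper: check the extras, then call multicombo.load."""
    missing = missing_extras(lang)
    if missing:
        raise ImportError(f"lang={lang!r} needs: {', '.join(missing)}")
    import multicombo  # imported lazily so the check works without it
    return multicombo.load(lang=lang)
```

Calling `load_pipeline("ja")` would then fail early with a readable message if SudachiPy or SudachiDict-core is absent, rather than partway through loading.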
Installation for Linux
pip3 install multicombo --user
Installation for Cygwin64
Make sure to get the python37-devel, python37-pip, python37-cython, python37-numpy, python37-cffi, gcc-g++, mingw64-x86_64-gcc-g++, gcc-fortran, git, curl, make, cmake, libopenblas, liblapack-devel, libhdf5-devel, libfreetype-devel, and libuv-devel packages, and then:
curl -L https://raw.githubusercontent.com/KoichiYasuoka/UniDic-COMBO/master/cygwin64.sh | sh
pip3.7 install multicombo
Installation for Jupyter Notebook (Google Colaboratory)
!pip install multicombo
Try notebook for Google Colaboratory.