Dependency parser for the Thai language
spaCy-Thai
Tokenizer, POS tagger, and dependency parser for the Thai language, based on Universal Dependencies.
Basic Usage
>>> import spacy_thai
>>> nlp=spacy_thai.load()
>>> doc=nlp("แผนกนี้กำลังเผชิญกับความท้าทายใหม่")
>>> for t in doc:
...     print("\t".join([str(t.i+1),t.orth_,t.lemma_,t.pos_,t.tag_,"_",str(0 if t.head==t else t.head.i+1),t.dep_,"_","_" if t.whitespace_ else "SpaceAfter=No"]))
...
1 แผนก แผนก NOUN NCMN _ 4 nsubj _ SpaceAfter=No
2 นี้ นี้ DET DDAC _ 1 det _ SpaceAfter=No
3 กำลัง กำลัง AUX XVBM _ 4 aux _ SpaceAfter=No
4 เผชิญ เผชิญ VERB VSTA _ 0 ROOT _ SpaceAfter=No
5 กับ กับ ADP RPRE _ 6 case _ SpaceAfter=No
6 ความ ความ PART FIXN _ 4 obl _ SpaceAfter=No
7 ท้าทาย ท้าทาย VERB VACT _ 6 acl _ SpaceAfter=No
8 ใหม่ ใหม่ ADV ADVN _ 7 advmod _ SpaceAfter=No
>>> import deplacy
>>> deplacy.render(doc,WordRight=True)
nsubj  ╔════════>╔═ NOUN แผนก
det    ║         ╚> DET  นี้
aux    ║ ╔════════> AUX  กำลัง
ROOT   ╚═╚═╔═══════ VERB เผชิญ
case       ║ ╔════> ADP  กับ
obl        ╚>╚═╔═══ PART ความ
acl            ╚>╔═ VERB ท้าทาย
advmod           ╚> ADV  ใหม่
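The one-line print call in the session above packs all ten CoNLL-U columns into a single join. As a minimal sketch, the same logic can be spelled out column by column. The helper name conllu_line and the StubToken class are hypothetical and not part of spacy_thai; StubToken only mimics the spaCy Token attributes the session uses, so the block runs without the model installed. The two stub tokens are hand-made illustration data, not parser output.

```python
class StubToken:
    """Hypothetical stand-in exposing the spaCy Token attributes used below."""

    def __init__(self, i, orth, pos, tag, dep, whitespace=""):
        self.i = i                    # 0-based position in the sentence
        self.orth_ = orth             # surface form
        self.lemma_ = orth            # lemma (same as the form in this demo)
        self.pos_ = pos               # universal POS tag
        self.tag_ = tag               # language-specific POS tag
        self.dep_ = dep               # dependency relation
        self.whitespace_ = whitespace # trailing whitespace, "" if none
        self.head = self              # a root token is its own head, as in spaCy


def conllu_line(t):
    """Format one token as a tab-separated, 10-column CoNLL-U line."""
    head = 0 if t.head == t else t.head.i + 1  # root gets HEAD 0
    misc = "_" if t.whitespace_ else "SpaceAfter=No"
    columns = [
        str(t.i + 1),  # ID (1-based)
        t.orth_,       # FORM
        t.lemma_,      # LEMMA
        t.pos_,        # UPOS
        t.tag_,        # XPOS
        "_",           # FEATS (not filled in)
        str(head),     # HEAD
        t.dep_,        # DEPREL
        "_",           # DEPS (not filled in)
        misc,          # MISC
    ]
    return "\t".join(columns)


# Hand-made two-token fragment (illustration only, not parser output).
root = StubToken(0, "แผนก", "NOUN", "NCMN", "ROOT")
det = StubToken(1, "นี้", "DET", "DDAC", "det")
det.head = root

lines = [conllu_line(t) for t in (root, det)]
print("\n".join(lines))
```

With a real parse, the same helper should work on each token of doc, since it only reads attributes that the session above already relies on.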
Installation for Linux
pip3 install spacy_thai --user
Installation for Cygwin
Make sure to get the packages python37-devel, python37-pip, python37-numpy, python37-cython, and gcc-g++, and then:
pip3.7 install spacy_thai
Installation for Google Colaboratory
!pip install spacy_thai
Try notebook.
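Whichever installation route was used, it can help to confirm that the package is visible to the running interpreter before loading the model. The following is a small sketch using only the standard library; the helper name installed_version is an illustrative assumption, not part of spacy_thai.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of `package`, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


version = installed_version("spacy_thai")
print("spacy_thai:", version if version else "not installed")
```

If this reports "not installed", the pip command probably ran under a different Python than the one executing the check.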
Source Distributions
No source distribution files are available for this release.
Built Distribution
File details
Details for the file spacy_thai-0.7.8-py3-none-any.whl.
File metadata
- Download URL: spacy_thai-0.7.8-py3-none-any.whl
- Upload date:
- Size: 15.7 MB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.2 CPython/3.9.2
File hashes
Algorithm | Hash digest
---|---
SHA256 | b3d9b20f9d02a506f638c4d54712b938cf29d078fbe3228d85f2b7f1c7c05bd2
MD5 | 314f06d49ebdbe709861fa6939504ab9
BLAKE2b-256 | d10583d46cea4cb48387f50a027a018eb3fbd31650b7b2ae07b024e7786436b8
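The published digests can be used to check a manually downloaded wheel. Below is a minimal sketch with the standard hashlib module; the helper names sha256_of and matches_published_hash are illustrative, and the local path in WHEEL is an assumption about where the file was saved.

```python
import hashlib
import os

# SHA256 digest listed above for spacy_thai-0.7.8-py3-none-any.whl
EXPECTED_SHA256 = "b3d9b20f9d02a506f638c4d54712b938cf29d078fbe3228d85f2b7f1c7c05bd2"

# Assumed local path of the downloaded wheel (adjust as needed).
WHEEL = "spacy_thai-0.7.8-py3-none-any.whl"


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path, expected=EXPECTED_SHA256):
    """True if the file's SHA256 digest equals the expected hex digest."""
    return sha256_of(path) == expected.lower()


if os.path.exists(WHEEL):
    print("hash OK" if matches_published_hash(WHEEL) else "hash MISMATCH")
else:
    print("wheel not found at", WHEEL)
```

Reading in chunks keeps memory use small even though the wheel is 15.7 MB.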