CKIP CoreNLP Wrappers
Introduction
Git
PyPI
Documentation
Requirements
CkipWs (Optional)
CKIP Word Segmentation Linux version (20190524+)
CkipParser (Optional)
CKIP Parser Linux version (20190506+)
Boost C++ Libraries 1.54.0
Installation
Let <ckipws-linux-root> denote the root path of the CKIPWS Linux version, and <ckipparser-linux-root> the root path of the CKIP-Parser Linux version.
Step 1: Set Up the CKIPWS & CKIP-Parser Environment
Add the following line to ~/.bashrc:
export LD_LIBRARY_PATH=<ckipws-linux-root>/lib:<ckipparser-linux-root>/lib:$LD_LIBRARY_PATH
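After opening a new shell, it can be useful to confirm that both library directories actually appear in LD_LIBRARY_PATH. The following is a minimal sketch of such a check. The function name `missing_lib_dirs` and the `/opt/...` paths are hypothetical and not part of ckipnlp; the check only inspects the environment string and does not try to load any library.

```python
import os


def missing_lib_dirs(required, env=None):
    """Return the directories in `required` that are absent from LD_LIBRARY_PATH."""
    env = os.environ if env is None else env
    # LD_LIBRARY_PATH is a colon-separated list; skip empty entries.
    entries = [
        os.path.normpath(p)
        for p in env.get("LD_LIBRARY_PATH", "").split(":")
        if p
    ]
    return [d for d in required if os.path.normpath(d) not in entries]


# Hypothetical install roots; substitute your own <ckipws-linux-root>
# and <ckipparser-linux-root>.
print(missing_lib_dirs(["/opt/ckipws/lib", "/opt/ckipparser/lib"]))
```

An empty list means both directories are on the search path.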
Step 2: Install Using Pip
pip install ckipnlp \
--install-option='--ws' \
--install-option='--ws-dir=<ckipws-linux-root>' \
--install-option='--parser' \
--install-option='--parser-dir=<ckipparser-linux-root>'
Omit the ws or parser options if you do not have CKIPWS or CKIP-Parser, respectively.
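When the install is scripted, the command can be assembled so that only the components actually present are enabled. This is a sketch under assumptions: `pip_install_args` is a hypothetical helper, the `/opt/ckipws` path is a placeholder, and the code only builds the argument list shown in Step 2 without running pip.

```python
def pip_install_args(ws_dir=None, parser_dir=None):
    """Build the `pip install ckipnlp` argument list for the available components."""
    args = ["pip", "install", "ckipnlp"]
    if ws_dir is not None:
        # Enable CKIPWS only when its root directory is known.
        args += ["--install-option=--ws",
                 f"--install-option=--ws-dir={ws_dir}"]
    if parser_dir is not None:
        # Enable CKIP-Parser only when its root directory is known.
        args += ["--install-option=--parser",
                 f"--install-option=--parser-dir={parser_dir}"]
    return args


# Hypothetical CKIPWS-only setup.
print(" ".join(pip_install_args(ws_dir="/opt/ckipws")))
```

The resulting list can be passed to `subprocess.run` if the install should be launched from Python.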
Installation Options
| Option | Detail | Default Value |
|---|---|---|
| --[no-]ws | Enable/disable CKIPWS. | False |
| --[no-]parser | Enable/disable CKIP-Parser. | False |
| --ws-dir=<ws-dir> | CKIPWS root directory. | |
| --ws-lib-dir=<ws-lib-dir> | CKIPWS libraries directory. | <ws-dir>/lib |
| --ws-share-dir=<ws-share-dir> | CKIPWS share directory. | <ws-dir> |
| --parser-dir=<parser-dir> | CKIP-Parser root directory. | |
| --parser-lib-dir=<parser-lib-dir> | CKIP-Parser libraries directory. | <parser-dir>/lib |
| --parser-share-dir=<parser-share-dir> | CKIP-Parser share directory. | <parser-dir> |
| --data2-dir=<data2-dir> | “Data2” directory. | <ws-share-dir>/Data2 |
| --rule-dir=<rule-dir> | “Rule” directory. | <parser-share-dir>/Rule |
| --rdb-dir=<rdb-dir> | “RDB” directory. | <parser-share-dir>/RDB |
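The defaults in the table chain off one another: the data directories derive from the share directories, which in turn derive from the root directories. The sketch below illustrates that chain as read from the table. `resolve_install_dirs` is a hypothetical helper, not the installer's own code, and the assumption that an overridden share directory also moves the directories derived from it is inferred from the default-value column.

```python
import posixpath


def resolve_install_dirs(ws_dir, parser_dir, overrides=None):
    """Fill in derived directories using the defaults listed in the options table."""
    dirs = {"ws-dir": ws_dir, "parser-dir": parser_dir}
    dirs.update(overrides or {})
    # Order matters: the share directories must be settled before
    # the directories that default to paths beneath them.
    dirs.setdefault("ws-lib-dir", posixpath.join(dirs["ws-dir"], "lib"))
    dirs.setdefault("ws-share-dir", dirs["ws-dir"])
    dirs.setdefault("parser-lib-dir", posixpath.join(dirs["parser-dir"], "lib"))
    dirs.setdefault("parser-share-dir", dirs["parser-dir"])
    dirs.setdefault("data2-dir", posixpath.join(dirs["ws-share-dir"], "Data2"))
    dirs.setdefault("rule-dir", posixpath.join(dirs["parser-share-dir"], "Rule"))
    dirs.setdefault("rdb-dir", posixpath.join(dirs["parser-share-dir"], "RDB"))
    return dirs


# Hypothetical roots, for illustration only.
for name, path in resolve_install_dirs("/opt/ckipws", "/opt/ckipparser").items():
    print(f"--{name}={path}")
```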
Usage
See http://ckipnlp.readthedocs.io/ for API details.
CKIPWS
import ckipnlp.ws
print(ckipnlp.__name__, ckipnlp.__version__)
ws = ckipnlp.ws.CkipWs(logger=False)
print(ws('中文字喔'))
for l in ws.apply_list(['中文字喔', '啊哈哈哈']): print(l)
ws.apply_file(ifile='sample/sample.txt', ofile='output/sample.tag', uwfile='output/sample.uw')
with open('output/sample.tag') as fin:
    print(fin.read())
with open('output/sample.uw') as fin:
    print(fin.read())
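Segmented output takes the form of whitespace-separated word(POS) tokens, such as the '中文字(Na) 喔(T)' string used in the Utilities example. For quick post-processing without the ckipnlp utilities, such a line can be split with plain Python. This is only a sketch: `split_tagged` is a hypothetical helper, not the library's WsSentence implementation, and it assumes the tag is the last parenthesised group of each token.

```python
def split_tagged(line):
    """Split 'word(POS) word(POS)' text into (word, POS) pairs."""
    pairs = []
    # str.split() with no argument splits on any whitespace,
    # including the full-width space U+3000.
    for token in line.split():
        word, sep, tag = token.rpartition("(")
        if not sep:
            # No tag present; keep the token as an untagged word.
            pairs.append((token, None))
            continue
        if tag.endswith(")"):
            tag = tag[:-1]
        pairs.append((word, tag))
    return pairs


print(split_tagged("中文字(Na) 喔(T)"))
# → [('中文字', 'Na'), ('喔', 'T')]
```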
CKIP-Parser
import ckipnlp.parser
print(ckipnlp.__name__, ckipnlp.__version__)
ps = ckipnlp.parser.CkipParser(logger=False)
print(ps('中文字喔'))
for l in ps.apply_list(['中文字喔', '啊哈哈哈']): print(l)
ps.apply_file(ifile='sample/sample.txt', ofile='output/sample.tree')
with open('output/sample.tree') as fin:
    print(fin.read())
Utilities
import ckipnlp
print(ckipnlp.__name__, ckipnlp.__version__)
from ckipnlp.util.ws import *
from ckipnlp.util.parser import *
# Format CkipWs output
ws_text = ['中文字(Na) 喔(T)', '啊哈(I) 哈哈(D)']
for text in ws_text: print(ckipnlp.util.ws.WsSentence.from_text(text))
for text in ws_text: print(repr(ckipnlp.util.ws.WsSentence.from_text(text)))
# Show CkipParser output as tree
tree_text = 'S(theme:NP(property:N‧的(head:Nhaa:我|Head:DE:的)|Head:Nad(DUMMY1:Nab:早餐|Head:Caa:和|DUMMY2:Naa:午餐))|quantity:Dab:都|Head:VC31:吃完|aspect:Di:了)'
tree = ParserTree.from_text(tree_text)
tree.show()
# Get dummies of node 5
for node in tree.get_dummies(5): print(node)
# Get heads of node 1
for node in tree.get_heads(1): print(node)
# Get relations
for r in tree.get_relations(0): print(r)
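The bracketed tree text used above nests children in parentheses and separates siblings with `|`. To make that structure concrete, here is a small recursive-descent sketch that reads such text into nested `(label, children)` tuples. It is not the ParserTree implementation: `parse_tree` and `leaf_words` are hypothetical names, and the sketch assumes that words never contain `(`, `|`, or `)`.

```python
def parse_tree(text):
    """Parse 'label(child|child|...)' text into nested (label, children) tuples."""
    pos = 0

    def node():
        nonlocal pos
        start = pos
        # A label runs until the next structural character.
        while pos < len(text) and text[pos] not in "(|)":
            pos += 1
        label = text[start:pos]
        children = []
        if pos < len(text) and text[pos] == "(":
            pos += 1  # consume '('
            children.append(node())
            while pos < len(text) and text[pos] == "|":
                pos += 1  # consume '|'
                children.append(node())
            if pos >= len(text) or text[pos] != ")":
                raise ValueError(f"expected ')' at offset {pos}")
            pos += 1  # consume ')'
        return label, children

    root = node()
    if pos != len(text):
        raise ValueError(f"unexpected text at offset {pos}")
    return root


def leaf_words(tree):
    """Yield the word (the part after the last ':') of every leaf, left to right."""
    label, children = tree
    if not children:
        yield label.rsplit(":", 1)[-1]
    for child in children:
        yield from leaf_words(child)


subtree = parse_tree("Head:Nad(DUMMY1:Nab:早餐|Head:Caa:和|DUMMY2:Naa:午餐)")
print("".join(leaf_words(subtree)))
# → 早餐和午餐
```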
FAQ
CKIPWS throws “what(): locale::facet::_S_create_c_locale name not valid”. What should I do?
Install the missing locale data. On Debian or Ubuntu, for example:
apt-get install locales-all
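Before or after installing the locale data, a quick check from Python can show whether the C library accepts a given locale name. This is a sketch only: `locale_available` is a hypothetical helper, and which locale CKIPWS actually requests is not stated here, so testing the environment default (the empty string) is an assumption.

```python
import locale


def locale_available(name):
    """Return True if the C library accepts `name` for the LC_CTYPE category."""
    saved = locale.setlocale(locale.LC_CTYPE)  # query the current setting
    try:
        locale.setlocale(locale.LC_CTYPE, name)
        return True
    except locale.Error:
        return False
    finally:
        # Restore the previous setting so the check has no side effects.
        locale.setlocale(locale.LC_CTYPE, saved)


# "" means "whatever LANG / LC_* in the environment select".
print("environment default locale usable:", locale_available(""))
```

If this prints False, the locale named by the environment variables is not installed, which is consistent with the error above.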
License