CKIP NLP Wrappers
CKIP NLP Wrappers (Word Segmentation and Parser)
Introduction
Git
PyPI
Requirements
Python 2.7+, 3.5+
Cython 0.29+
Boost C++ Libraries 1.54.0
CKIP Word Segmentation Linux version
CKIP Parser Linux version
Installation
Step 1: Setup CKIPWS environment
Let <ckipws-linux-root> denote the root path of the CKIPWS Linux version. Add the following commands to ~/.bashrc:
export LD_LIBRARY_PATH=<ckipws-linux-root>/lib:$LD_LIBRARY_PATH
export CKIPWS_DATA2=<ckipws-linux-root>/Data2
Step 2: Setup CKIP-Parser environment
Let <ckipparser-linux-root> denote the root path of the CKIP-Parser Linux version. Add the following commands to ~/.bashrc:
export LD_LIBRARY_PATH=<ckipparser-linux-root>/lib:$LD_LIBRARY_PATH
export CKIPPARSER_RULE=<ckipparser-linux-root>/Rule
export CKIPPARSER_RDB=<ckipparser-linux-root>/RDB
Step 3: Install Using Pip
LIBRARY_PATH=<ckipws-linux-root>/lib:<ckipparser-linux-root>/lib:$LIBRARY_PATH pip install pyckip
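Before running pip, it can help to confirm that the variables from Steps 1 and 2 are visible and point at real directories. The following is a minimal sketch; the helper name missing_ckip_paths is hypothetical and not part of pyckip, and it only checks the three data variables named above, not LD_LIBRARY_PATH.

```python
import os

# Variables exported in Steps 1 and 2 above.
REQUIRED_VARS = ("CKIPWS_DATA2", "CKIPPARSER_RULE", "CKIPPARSER_RDB")


def missing_ckip_paths(environ=None):
    """Return the variables that are unset or do not name an existing directory."""
    environ = os.environ if environ is None else environ
    missing = []
    for name in REQUIRED_VARS:
        path = environ.get(name)
        if not path or not os.path.isdir(path):
            missing.append(name)
    return missing


# Report problems for the current shell environment (empty list means all set).
print("Missing or invalid:", missing_ckip_paths())
```

If only one of the two tools is installed, the variables for the other one will be reported as missing, which is expected.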
API
CkipWS
class ckipws.CkipWS(logger=False, inifile=None, data2dir=None, lexfile=None, new_style_format=False, show_category=True)
The CKIP word segmentation driver.
- logger (bool)
enable logger.
- inifile (str)
the path to the INI file.
- data2dir (str)
the path to the folder “Data2/” (default is “$CKIPWS_DATA2/”).
- lexfile (str)
the path to the user-defined lexicon file.
- new_style_format (bool)
split sentences by newline characters ("\n") rather than punctuation.
- show_category (bool)
show part-of-speech tags.
def ckipws.CkipWS.__call__(text, unicode=False)
Segment a sentence.
- text (str)
the input sentence.
- unicode (bool)
use Unicode for input/output encoding; otherwise use the system encoding.
- return value (str)
the output sentence.
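As a rough illustration of the constructor and call described above, the sketch below builds a driver and segments one sentence. The helper names build_ws and segment are hypothetical; only ckipws.CkipWS and its documented parameters come from this page. The import is done lazily so the helpers can be defined even where CKIPWS support is not installed.

```python
def build_ws(lexfile=None, show_category=True):
    """Create a CkipWS driver; needs pyckip built with CKIPWS support."""
    import ckipws  # lazy import: raises ImportError if the wrapper is absent
    return ckipws.CkipWS(lexfile=lexfile, show_category=show_category)


def segment(ws, text):
    """Segment one sentence, using Unicode for both input and output."""
    return ws(text, unicode=True)


# Typical use (requires the CKIPWS libraries and data to be set up):
#   ws = build_ws()
#   print(segment(ws, u"中文字耶，啊哈哈哈"))
```

Passing unicode=True avoids depending on the system encoding, which is the documented fallback.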
def ckipws.CkipWS.apply_list(ilist, unicode=False)
Segment a list of sentences.
- ilist (list of str)
the list of input sentences.
- unicode (bool)
use Unicode for input/output encoding; otherwise use the system encoding.
- return value (list of str)
the list of output sentences.
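For batch input, one option is to split a block of text into lines and hand them to apply_list in a single call. This is a sketch only: segment_lines is a hypothetical helper, and it assumes apply_list returns exactly one output per input, in the same order, which this page does not state explicitly.

```python
def segment_lines(ws, text):
    """Segment every non-empty line of text with one apply_list call."""
    sentences = [line.strip() for line in text.splitlines() if line.strip()]
    # Assumption: results line up one-to-one with the input sentences.
    results = ws.apply_list(sentences, unicode=True)
    return list(zip(sentences, results))


# Typical use, with ws being a ckipws.CkipWS instance:
#   for source, segmented in segment_lines(ws, u"第一句\n第二句\n"):
#       print(source, "->", segmented)
```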
def ckipws.CkipWS.apply_file(ifile, ofile, uwfile)
Segment a file.
- ifile (str)
the input file.
- ofile (str)
the output file (will be overwritten).
- uwfile (str)
the unknown word file (will be overwritten).
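Because apply_file overwrites both the output file and the unknown word file, a small guard around the call can prevent accidental data loss. The wrapper below is a hypothetical sketch (segment_file and its overwrite flag are not part of pyckip); it passes the three paths positionally, matching the signature above.

```python
import os


def segment_file(ws, ifile, ofile, uwfile, overwrite=False):
    """Call ws.apply_file, refusing to clobber existing files unless asked."""
    if not os.path.isfile(ifile):
        raise IOError("input file not found: %s" % ifile)
    if not overwrite:
        for path in (ofile, uwfile):
            if os.path.exists(path):
                raise IOError("refusing to overwrite: %s" % path)
    ws.apply_file(ifile, ofile, uwfile)


# Typical use, with ws being a ckipws.CkipWS instance:
#   segment_file(ws, "input.txt", "output.txt", "unknown_words.txt")
```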
CkipParser
class ckipparser.CkipParser(logger=False, inifile=None, wsinifile=None, data2dir=None, ruledir=None, rdbdir=None, do_ws=True, do_parse=True, do_role=True, lexfile=None, new_style_format=False, show_category=True)
The CKIP parser driver.
- logger (bool)
enable logger (the logger is not supported in the parser).
- inifile (str)
the path to the INI file.
- wsinifile (str)
the path to the INI file for word segmentation (CKIPWS).
- data2dir (str)
the path to the folder “Data2/” (default is “$CKIPWS_DATA2/”).
- ruledir (str)
the path to the folder “Rule/” (default is “$CKIPPARSER_RULE/”).
- rdbdir (str)
the path to the folder “RDB/” (default is “$CKIPPARSER_RDB/”).
- do_ws (bool)
do word-segmentation.
- do_parse (bool)
do parsing.
- do_role (bool)
do role.
- lexfile (str)
the path to the user-defined lexicon file.
- new_style_format (bool)
split sentences by newline characters ("\n") rather than punctuation.
- show_category (bool)
show part-of-speech tags.
def ckipparser.CkipParser.__call__(text, unicode=False)
Parse a sentence.
- text (str)
the input sentence.
- unicode (bool)
use Unicode for input/output encoding; otherwise use the system encoding.
- return value (str)
the output sentence.
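The parser driver follows the same calling pattern as the segmenter. The sketch below shows one way to construct it with the stage flags listed above and parse a single sentence; build_parser and parse are hypothetical helper names, and the import is lazy so the code loads even without CKIP-Parser support.

```python
def build_parser(do_ws=True, do_parse=True, do_role=True, lexfile=None):
    """Create a CkipParser driver; needs pyckip built with parser support."""
    import ckipparser  # lazy import: raises ImportError if the wrapper is absent
    return ckipparser.CkipParser(
        do_ws=do_ws, do_parse=do_parse, do_role=do_role, lexfile=lexfile
    )


def parse(parser, text):
    """Parse one sentence, using Unicode for both input and output."""
    return parser(text, unicode=True)


# Typical use (requires the CKIPWS and CKIP-Parser libraries and data):
#   parser = build_parser()
#   print(parse(parser, u"中文字耶，啊哈哈哈"))
```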
def ckipparser.CkipParser.apply_list(ilist, unicode=False)
Parse a list of sentences.
- ilist (list of str)
the list of input sentences.
- unicode (bool)
use Unicode for input/output encoding; otherwise use the system encoding.
- return value (list of str)
the list of output sentences.
def ckipparser.CkipParser.apply_file(ifile, ofile)
Parse a file.
- ifile (str)
the input file.
- ofile (str)
the output file (will be overwritten).
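When the input is already in memory, apply_file can still be used through temporary files. The following is a sketch under stated assumptions: parse_via_files is a hypothetical helper, and the UTF-8 file encoding is a guess exposed as a parameter, since this page does not say which encoding the parser expects on disk.

```python
import io
import os
import shutil
import tempfile


def parse_via_files(parser, sentences, encoding="utf-8"):
    """Write sentences to a temp file, run apply_file, return the output lines."""
    workdir = tempfile.mkdtemp()
    try:
        ifile = os.path.join(workdir, "input.txt")
        ofile = os.path.join(workdir, "output.txt")
        # Assumption: one sentence per line, encoded as `encoding`.
        with io.open(ifile, "w", encoding=encoding) as handle:
            handle.write(u"\n".join(sentences) + u"\n")
        parser.apply_file(ifile, ofile)
        with io.open(ofile, "r", encoding=encoding) as handle:
            return handle.read().splitlines()
    finally:
        # Remove the temporary directory even if parsing fails.
        shutil.rmtree(workdir, ignore_errors=True)
```

For in-memory data, apply_list is usually the simpler choice; the file route is mainly useful when the same files are also needed by other tools.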
FAQ
I don’t have CKIPWS/CKIP-Parser. What should I do?
Append --install-option='--no-ws' or --install-option='--no-parser' after the pip install command to disable CKIPWS or CKIP-Parser.
# Disable CKIPWS support
pip install pyckip --install-option='--no-ws'
# Disable CKIP-Parser support
pip install pyckip --install-option='--no-parser'
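After a partial install, code that supports both tools may want to check which wrappers can actually be imported. The sketch below assumes that a disabled component simply leaves its module (ckipws or ckipparser, as named in the API section) unimportable; available_wrappers is a hypothetical helper, not part of pyckip.

```python
import importlib


def available_wrappers(names=("ckipws", "ckipparser")):
    """Map each module name to True if it can be imported, else False."""
    status = {}
    for name in names:
        try:
            importlib.import_module(name)
            status[name] = True
        except ImportError:
            status[name] = False
    return status


# Show which wrappers this installation provides.
print(available_wrappers())
```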
The CKIPWS throws “what(): locale::facet::_S_create_c_locale name not valid”. What should I do?
apt-get install locales-all
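This error message is typically raised by the C++ runtime when a requested locale is not installed on the system. As a quick diagnostic, the sketch below checks from Python whether a locale name can be activated; locale_is_available is a hypothetical helper, and which locale CKIPWS actually requests is not stated here, so the empty name (the locale taken from the environment) is used as the default.

```python
import locale


def locale_is_available(name=""):
    """Return True if the named locale can be activated on this system.

    An empty name means the locale configured through the environment
    (LANG, LC_ALL, ...).
    """
    saved = locale.setlocale(locale.LC_ALL)  # query the current setting
    try:
        locale.setlocale(locale.LC_ALL, name)
        return True
    except locale.Error:
        return False
    finally:
        locale.setlocale(locale.LC_ALL, saved)  # restore the previous setting


print("Environment locale usable:", locale_is_available())
```

If this prints False, installing the missing locales as shown above, or setting LANG to an installed locale, is a reasonable next step.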
License