
CKIP NLP Wrappers


CKIP NLP Wrappers (Word Segmentation and Parser)

Introduction

Git

https://github.com/emfomy/pyckip


PyPI

https://pypi.org/project/pyckip


Author

Requirements

Installation

Step 1: Setup CKIPWS environment

Let <ckipws-linux-root> denote the root path of the CKIPWS Linux version. Add the following commands to ~/.bashrc:

export LD_LIBRARY_PATH=<ckipws-linux-root>/lib:$LD_LIBRARY_PATH
export CKIPWS_DATA2=<ckipws-linux-root>/Data2

Step 2: Setup CKIP-Parser environment

Let <ckipparser-linux-root> denote the root path of the CKIP-Parser Linux version. Add the following commands to ~/.bashrc:

export LD_LIBRARY_PATH=<ckipparser-linux-root>/lib:$LD_LIBRARY_PATH
export CKIPPARSER_RULE=<ckipparser-linux-root>/Rule
export CKIPPARSER_RDB=<ckipparser-linux-root>/RDB

Step 3: Install Using Pip

LIBRARY_PATH=<ckipws-linux-root>/lib:<ckipparser-linux-root>/lib:$LIBRARY_PATH pip install pyckip
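A missing or mistyped variable from Steps 1 and 2 usually only shows up later as a load error, so it can help to check the environment first. The following is a minimal sketch, not part of pyckip: the helper name check_ckip_env is hypothetical, and it only verifies that the variables named above are set and that the data paths point to existing directories.

```python
import os

# Variable names taken from Steps 1 and 2 above.
REQUIRED_DIR_VARS = ("CKIPWS_DATA2", "CKIPPARSER_RULE", "CKIPPARSER_RDB")


def check_ckip_env(env=None):
    """Return a list of problems found; an empty list means nothing was flagged.

    Hypothetical helper (not provided by pyckip). `env` defaults to os.environ.
    """
    env = os.environ if env is None else env
    problems = []
    for name in REQUIRED_DIR_VARS:
        value = env.get(name)
        if not value:
            problems.append(f"{name} is not set")
        elif not os.path.isdir(value):
            problems.append(f"{name} points to a missing directory: {value}")
    # The shared libraries are located through LD_LIBRARY_PATH at run time.
    if not env.get("LD_LIBRARY_PATH"):
        problems.append("LD_LIBRARY_PATH is not set")
    return problems


if __name__ == "__main__":
    for line in check_ckip_env() or ["CKIP environment variables look complete"]:
        print(line)
```

If only one of CKIPWS or CKIP-Parser is installed, the corresponding names can be dropped from REQUIRED_DIR_VARS.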

API

CkipWS

class ckipws.CkipWS(logger=False, inifile=None, data2dir=None, lexfile=None, new_style_format=False, show_category=True)

The CKIP word segmentation driver.

logger (bool)

enable logger.

inifile (str)

the path to the INI file.

data2dir (str)

the path to the folder “Data2/” (default is “$CKIPWS_DATA2/”).

lexfile (str)

the path to the user-defined lexicon file.

new_style_format (bool)

split sentences by newline characters ("\n") rather than punctuation.

show_category (bool)

show part-of-speech tags.


def ckipws.CkipWS.__call__(text, unicode=False)

Segment a sentence.

text (str)

the input sentence.

unicode (bool)

use Unicode for input/output encoding; otherwise use system encoding.

return value (str)

the output sentence.


def ckipws.CkipWS.apply_list(ilist, unicode=False)

Segment a list of sentences.

ilist (list)

the list of input sentences (str).

unicode (bool)

use Unicode for input/output encoding; otherwise use system encoding.

return value (list)

the list of output sentences (str).


def ckipws.CkipWS.apply_file(ifile, ofile, uwfile)

Segment a file.

ifile (str)

the input file.

ofile (str)

the output file (will be overwritten).

uwfile (str)

the unknown word file (will be overwritten).
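As an illustration of how the constructor and apply_list fit together, here is a minimal sketch. The helper names make_segmenter and segment_sentences are hypothetical and not part of pyckip; only the CkipWS constructor arguments and apply_list come from the reference above. The ckipws import is deferred to inside the function because it needs the CKIPWS shared libraries to be installed.

```python
def make_segmenter(lexfile=None):
    """Build a CkipWS driver (hypothetical helper).

    The import is local so this file can be loaded on machines
    that do not have the CKIPWS libraries.
    """
    import ckipws  # module installed by pyckip

    return ckipws.CkipWS(logger=False, lexfile=lexfile, show_category=True)


def segment_sentences(ws, sentences):
    """Segment all non-empty sentences with a single apply_list call.

    `ws` is any object exposing apply_list(list, unicode=...), such as
    the driver returned by make_segmenter(). The list is passed
    positionally, so the name of the first parameter does not matter.
    """
    cleaned = [s.strip() for s in sentences if s.strip()]
    if not cleaned:
        return []
    return list(ws.apply_list(cleaned, unicode=True))
```

With CKIPWS installed, segment_sentences(make_segmenter(), sentences) would return one output string per non-empty input sentence. The exact output layout depends on show_category and is not assumed here.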

CkipParser

class ckipparser.CkipParser(logger=False, inifile=None, wsinifile=None, data2dir=None, ruledir=None, rdbdir=None, do_ws=True, do_parse=True, do_role=True, lexfile=None, new_style_format=False, show_category=True)

The CKIP parser driver.

logger (bool)

enable logger (the logger is not supported in the parser).

inifile (str)

the path to the INI file.

wsinifile (str)

the path to the INI file for CKIPWS.

data2dir (str)

the path to the folder “Data2/” (default is “$CKIPWS_DATA2/”).

ruledir (str)

the path to the folder “Rule/” (default is “$CKIPPARSER_RULE/”).

rdbdir (str)

the path to the folder “RDB/” (default is “$CKIPPARSER_RDB/”).

do_ws (bool)

do word-segmentation.

do_parse (bool)

do parsing.

do_role (bool)

do role labeling.

lexfile (str)

the path to the user-defined lexicon file.

new_style_format (bool)

split sentences by newline characters ("\n") rather than punctuation.

show_category (bool)

show part-of-speech tags.


def ckipparser.CkipParser.__call__(text, unicode=False)

Parse a sentence.

text (str)

the input sentence.

unicode (bool)

use Unicode for input/output encoding; otherwise use system encoding.

return value (str)

the output sentence.


def ckipparser.CkipParser.apply_list(ilist, unicode=False)

Parse a list of sentences.

ilist (list)

the list of input sentences (str).

unicode (bool)

use Unicode for input/output encoding; otherwise use system encoding.

return value (list)

the list of output sentences (str).


def ckipparser.CkipParser.apply_file(ifile, ofile)

Parse a file.

ifile (str)

the input file.

ofile (str)

the output file (will be overwritten).
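Because apply_file overwrites its output file, a small guard around the call can be useful. The sketch below is illustrative only: make_parser and parse_file are hypothetical names, and only the CkipParser constructor flags and apply_file(ifile, ofile) come from the reference above.

```python
import os


def make_parser():
    """Build a CkipParser driver (hypothetical helper).

    The import is local because it needs the CKIPWS and
    CKIP-Parser shared libraries to be installed.
    """
    import ckipparser  # module installed by pyckip

    return ckipparser.CkipParser(do_ws=True, do_parse=True, do_role=True)


def parse_file(parser, ifile, ofile, overwrite=False):
    """Run parser.apply_file after two basic safety checks.

    Raises FileNotFoundError if the input is missing and
    FileExistsError if the output exists and overwrite is False.
    Returns the output path.
    """
    if not os.path.isfile(ifile):
        raise FileNotFoundError(ifile)
    if os.path.exists(ofile) and not overwrite:
        raise FileExistsError(ofile)
    parser.apply_file(ifile, ofile)
    return ofile
```

With both tools installed, parse_file(make_parser(), "input.txt", "output.txt") would write the parser output to output.txt; the file names here are placeholders.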

FAQ

  • I don’t have CKIPWS/CKIP-Parser. What should I do?

Append --install-option='--no-ws' or --install-option='--no-parser' after the pip install command to disable CKIPWS or CKIP-Parser.

# Disable CKIPWS support
pip install pyckip --install-option='--no-ws'

# Disable CKIP-Parser support
pip install pyckip --install-option='--no-parser'

  • CKIPWS throws “what(): locale::facet::_S_create_c_locale name not valid”. What should I do?

Install the missing locales (on Debian-based systems):

apt-get install locales-all
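To see whether a given locale is installed before or after running the command above, one option is to ask the C library through Python's standard locale module. This is a rough sketch: locale_available is a hypothetical helper, and the candidate names are guesses, since the source does not say which locale CKIPWS requests.

```python
import locale


def locale_available(name):
    """Return True if the C library accepts `name` as a locale.

    The current locale is saved first and restored afterwards,
    so the check leaves the process state unchanged.
    """
    saved = locale.setlocale(locale.LC_ALL)  # query only, no change
    try:
        locale.setlocale(locale.LC_ALL, name)
        return True
    except locale.Error:
        return False
    finally:
        locale.setlocale(locale.LC_ALL, saved)


if __name__ == "__main__":
    # Candidate names are assumptions, not taken from the CKIPWS docs.
    for candidate in ("C", "en_US.UTF-8", "zh_TW.UTF-8"):
        status = "available" if locale_available(candidate) else "missing"
        print(f"{candidate}: {status}")
```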

License
