An Embedded CKIP Rasa NLU Components

rukip

Introduction

This open-source library implements custom components for Rasa.

It offers a tokenizer and a featurizer, both powered by ckiptagger, as components for a Rasa NLU pipeline.

Installation

pip install rukip

rukip is a Python library hosted on PyPI. Requirements:

  • python >= 3.6
  • rasa >= 1.4.0
  • ckiptagger[tf] >= 0.0.19

Usage

Download model files

The model files are available on several mirror sites.

You can download the archive and extract it to the desired path in two steps:

  1. Download the archive to ./data.zip
  2. Extract it to ./data/
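
The two steps above can be sketched with the Python standard library. This is a minimal sketch, not the library's own downloader: the function names are hypothetical, the mirror URL is deliberately left as a parameter because the source does not name one, and the archive is assumed to contain a top-level data/ directory.

```python
import urllib.request
import zipfile
from pathlib import Path


def download_archive(url: str, zip_path: str = "./data.zip") -> Path:
    """Step 1: download the model archive from `url` to `zip_path`."""
    target = Path(zip_path)
    urllib.request.urlretrieve(url, str(target))
    return target


def extract_archive(zip_path: str = "./data.zip", dest: str = ".") -> list:
    """Step 2: extract the archive into `dest` and return the member names.

    Assumption: the archive holds a top-level data/ directory, so
    extracting into "." produces ./data/.
    """
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(dest)
        return archive.namelist()


# Usage (the mirror URL is yours to supply):
#   download_archive(mirror_url)
#   extract_archive()
```

After both calls, ./data can be used as the model_path value in the pipeline configuration.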

Add the CKIPTokenizer component to the Rasa NLU pipeline and set model_path to ./data.

The following is an example Rasa NLU config file:

 language: "zh"
 pipeline:
   - name: "rukip.tokenizer.CKIPTokenizer"
     model_path: "./data"
   - name: "rukip.featurizer.CKIPFeaturizer"
     model_path: "./data"
     token_features: ["word", "pos"]
   - name: "CRFEntityExtractor"
     features: [["ner_features"], ["ner_features"], ["ner_features"]]
   - name: "CountVectorsFeaturizer"
   - name: "EmbeddingIntentClassifier"
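
Because both rukip components require model_path, it can help to check that the directory exists before training. The sketch below is a hypothetical pre-flight check, not part of rukip or Rasa: it mirrors the rukip entries of the example config as plain Python dictionaries and reports any component whose model_path is missing or is not a directory.

```python
from pathlib import Path

# Mirrors the rukip-related entries of the example config above.
PIPELINE = [
    {"name": "rukip.tokenizer.CKIPTokenizer", "model_path": "./data"},
    {
        "name": "rukip.featurizer.CKIPFeaturizer",
        "model_path": "./data",
        "token_features": ["word", "pos"],
    },
    {"name": "CRFEntityExtractor"},
]


def components_missing_model(pipeline):
    """Return the names of rukip components whose model_path is not a directory."""
    missing = []
    for component in pipeline:
        if not component["name"].startswith("rukip."):
            continue  # only rukip components need the CKIP model files
        model_path = component.get("model_path")
        if model_path is None or not Path(model_path).is_dir():
            missing.append(component["name"])
    return missing
```

An empty list from components_missing_model(PIPELINE) means every rukip component points at an existing directory; it does not verify the contents of that directory.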

Components

CKIPTokenizer

This component has one required field, model_path, and two optional fields for assigning user-defined dictionaries.

  • recommend_dict_path is the path to a file listing user-defined recommended words. Default is None.
  • cooerce_dict_path is the path to a file listing must-words. Default is None.

The following is an example of a user-defined dictionary. Each line holds one word and its weight, separated by a space.

土地公 1
土地婆 1
公有 2
來亂的 1
緯來體育台 1
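
To make the file format concrete, the sketch below reads such a file into a word-to-weight mapping. It illustrates the format only and is an assumption about how a dictionary file could be parsed, not rukip's actual loader; the function name load_weighted_dictionary is hypothetical.

```python
def load_weighted_dictionary(path):
    """Parse a file of '<word> <weight>' lines into a {word: weight} dict."""
    word_to_weight = {}
    with open(path, encoding="utf-8") as handle:
        for line_number, line in enumerate(handle, start=1):
            line = line.strip()
            if not line:
                continue  # skip blank lines
            # Split on the last run of whitespace: word first, weight last.
            parts = line.rsplit(maxsplit=1)
            if len(parts) != 2:
                raise ValueError(
                    f"line {line_number}: expected '<word> <weight>'"
                )
            word, weight = parts
            word_to_weight[word] = float(weight)
    return word_to_weight
```

A malformed line raises ValueError with its line number, which makes a mistyped entry easy to locate.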

CKIPFeaturizer

This component has one required field, model_path, and one optional field.

  • token_features is the list of features extracted from ckiptagger to generate ner_features. Default is ["word", "pos"].
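
As a rough illustration of what token_features selects, the sketch below builds one feature dictionary per token from a segmented sentence and its part-of-speech tags. The function name and the output shape are assumptions made for illustration; the source does not describe how rukip lays out ner_features internally, and the POS tags in the usage note are placeholders.

```python
def select_token_features(words, pos_tags, token_features=("word", "pos")):
    """Return one dict per token holding only the requested features."""
    if len(words) != len(pos_tags):
        raise ValueError("words and pos_tags must have the same length")
    available = {"word": words, "pos": pos_tags}
    unknown = [name for name in token_features if name not in available]
    if unknown:
        raise ValueError(f"unsupported token features: {unknown}")
    return [
        {name: available[name][index] for name in token_features}
        for index in range(len(words))
    ]
```

For example, select_token_features(["土地公", "公有"], ["Nb", "A"], ["pos"]) keeps only the tag of each token, while the default keeps both the word and its tag.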

Development

$> git clone git@github.com:circlelychen/rukip.git
$> cd rukip
$> pip install -r requirements-to-freeze.dev.txt
$> make test

License

Licensed under the GNU General Public License v3.0.

Download files

Files for rukip, version 0.0.6
  • rukip-0.0.6-py3-none-any.whl (19.4 kB): file type Wheel, Python version py3
  • rukip-0.0.6.tar.gz (4.3 kB): file type Source, Python version None
