An Embedded CKIP Rasa NLU Components
rukip
Introduction
This open-source library implements custom components for Rasa.
It offers a tokenizer and a featurizer, both powered by ckiptagger, as components for a Rasa NLU pipeline.
Installation
pip install rukip
rukip is a Python library hosted on PyPI. Requirements:
- python >= 3.6
- rasa >= 1.4.0
- ckiptagger[tf] >= 0.0.19
Usage
Download model files
The model files are available on several mirror sites.
You can download and extract them to the desired path in two steps:
- Download to ./data.zip
- Extract to ./data/
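The two steps above can be sketched with the download helper that ships with ckiptagger. This is a minimal sketch, assuming `ckiptagger.data_utils.download_data_gdown` (as documented in the ckiptagger README) is available in your installed version and that its extra download dependency is installed; `ensure_ckip_models` is a hypothetical wrapper name and is not part of rukip.

```python
from pathlib import Path


def ensure_ckip_models(target_dir="./"):
    """Download the CKIP model files into target_dir unless data/ already exists.

    Returns the directory to use as `model_path` in the Rasa NLU config.
    """
    data_dir = Path(target_dir) / "data"
    if data_dir.is_dir():
        # The model files were extracted earlier; skip the large download.
        return data_dir

    # Imported lazily so the existence check works without ckiptagger installed.
    from ckiptagger import data_utils

    # Assumption: this downloads <target_dir>/data.zip and extracts it
    # to <target_dir>/data/, matching the two steps listed above.
    data_utils.download_data_gdown(str(target_dir))
    return data_dir
```

Calling `ensure_ckip_models("./")` once before training should leave the files under ./data, which is the value used for model_path in the rest of this document.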
Add the CKIPTokenizer component to the Rasa NLU pipeline and set model_path to ./data.
The following is an example Rasa NLU config file:
language: "zh"
pipeline:
  - name: "rukip.tokenizer.CKIPTokenizer"
    model_path: "./data"
  - name: "rukip.featurizer.CKIPFeaturizer"
    model_path: "./data"
    token_features: ["word", "pos"]
  - name: "CRFEntityExtractor"
    features: [["ner_features"], ["ner_features"], ["ner_features"]]
  - name: "CountVectorsFeaturizer"
  - name: "EmbeddingIntentClassifier"
Components
CKIPTokenizer
This component has one required field, model_path, and two optional fields for supplying user-defined dictionaries:
- recommend_dict_path is the file containing a list of user-defined recommended words. Default is None.
- cooerce_dict_path is the file containing a list of must-words. Default is None.
The following is an example of a user-defined dictionary. Each line holds one word and its weight.
土地公 1
土地婆 1
公有 2
來亂的 1
緯來體育台 1
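A file in this format can be read into a word-to-weight mapping. The sketch below is illustrative only: `load_word_weights` is a hypothetical helper, and the parsing rules (whitespace-separated fields, blank lines skipped) are assumptions, since how rukip itself reads the file is not described here.

```python
def load_word_weights(path):
    """Parse a dictionary file with one `word weight` pair per line."""
    word_to_weight = {}
    with open(path, encoding="utf-8") as f:
        for line_no, line in enumerate(f, start=1):
            line = line.strip()
            if not line:
                continue  # ignore blank lines
            parts = line.split()
            if len(parts) != 2:
                raise ValueError(
                    f"line {line_no}: expected 'word weight', got {line!r}"
                )
            word, weight = parts
            # Weights are stored as floats; a non-numeric weight raises ValueError.
            word_to_weight[word] = float(weight)
    return word_to_weight
```

Running a check like this before training can catch malformed lines early. With ckiptagger installed, a mapping of this shape is what `ckiptagger.construct_dictionary` accepts.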
CKIPFeaturizer
This component has one required field, model_path, and one optional field:
- token_features is the list of features extracted from ckiptagger to generate ner_features. Default is ["word", "pos"].
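To make the role of token_features concrete, the sketch below picks the listed fields out of per-token annotations. It is an illustration under stated assumptions, not rukip's implementation: `select_token_features`, the dict layout of each token, and the sample POS tags are all hypothetical.

```python
def select_token_features(tokens, token_features=("word", "pos")):
    """Keep only the requested fields for each token, in the order requested."""
    selected = []
    for token in tokens:
        missing = [name for name in token_features if name not in token]
        if missing:
            raise KeyError(f"token {token!r} lacks features: {missing}")
        selected.append([token[name] for name in token_features])
    return selected


# Hypothetical annotations for a two-token sentence.
tokens = [
    {"word": "土地公", "pos": "Nb"},
    {"word": "有", "pos": "V_2"},
]

# Restricting token_features to ["pos"] drops the surface word from each token.
pos_only = select_token_features(tokens, ["pos"])
```

Under this reading, narrowing token_features changes which ckiptagger outputs reach the downstream entity extractor through ner_features.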
Development
$> git clone git@github.com:circlelychen/rukip.git
$> pip install -r requirements-to-freeze.dev.txt
$> make test
License
Licensed under the GNU General Public License v3.0.