An open-source Chinese NLP Dataset Reader library, built on allennlp & pytorch.
chreader
A toolkit for Chinese natural language processing datasets
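Features
- Easy to use
  - Downloads and caches datasets automatically; a single command fetches the dataset you name
  - Lists the available datasets and their detailed descriptions from the command line
  - Integrates seamlessly with common NLP frameworks such as allennlp, catalyst, pytorch_lightning, and FARM
- Rich: supports datasets of several types, including classification, generation, and tagging (2 in total)
- Flexible
  - Add custom datasets freely; you only need to subclass ChDatasetReader
  - Through allennlp, use components such as tokenizer, token_indexer, and vocab, with advanced configuration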
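The idea of adding a custom dataset by subclassing a reader can be pictured with the sketch below. The real interface of ChDatasetReader is not documented in this README, so the code uses a local stand-in base class instead of importing the library. The `StandInDatasetReader` and `MyReviewsReader` classes, the `_read` hook (modeled loosely on the convention used by allennlp's DatasetReader), and the tab-separated sample format are all assumptions made for illustration.

```python
import io
from typing import Dict, Iterable, Iterator


class StandInDatasetReader:
    """Hypothetical stand-in for ChDatasetReader; the real API may differ."""

    def read(self, lines: Iterable[str]) -> Iterator[Dict[str, str]]:
        # Delegate parsing to the subclass hook.
        yield from self._read(lines)

    def _read(self, lines: Iterable[str]) -> Iterator[Dict[str, str]]:
        raise NotImplementedError


class MyReviewsReader(StandInDatasetReader):
    """Custom reader for a made-up 'label<TAB>text' line format."""

    def _read(self, lines: Iterable[str]) -> Iterator[Dict[str, str]]:
        for line in lines:
            line = line.strip()
            if not line:
                continue  # skip blank lines
            label, text = line.split("\t", maxsplit=1)
            yield {"label": label, "text": text}


# An in-memory file stands in for a downloaded dataset file.
sample = io.StringIO("pos\t这部电影很好看\nneg\t剧情太拖沓\n")
instances = list(MyReviewsReader().read(sample))
print(instances)
```

With the real library, the subclass would presumably inherit from ChDatasetReader and follow whatever hooks that class defines; check the source before relying on the method names used here.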
Installation
- git
  git clone https://github.com/wangyuxinwhy/chreader.git
  cd chreader
  pip install -e .
- pip
  pip install -U chreader
Usage
Building a Dataset & DataLoader
from chreader import load_dataset, DataLoader
# Load the train and dev splits of the tnews dataset
train_dataset = load_dataset("tnews", "train")
dev_dataset = load_dataset("tnews", "dev")
# Wrap each split in a DataLoader that yields batches of 32 instances
train_dataloader = DataLoader(train_dataset, batch_size=32)
dev_dataloader = DataLoader(dev_dataset, batch_size=32)
for data in train_dataloader:
    ...
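To make the effect of `batch_size=32` concrete, here is a minimal pure-Python sketch of batching. It is not chreader's DataLoader: the `iter_batches` helper and the sample headlines are hypothetical, and the sketch assumes no shuffling and that the final, smaller batch is kept.

```python
from typing import Iterator, List, Sequence, TypeVar

T = TypeVar("T")


def iter_batches(items: Sequence[T], batch_size: int) -> Iterator[List[T]]:
    """Yield consecutive groups of at most batch_size items."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    for start in range(0, len(items), batch_size):
        yield list(items[start:start + batch_size])


# 70 made-up examples split into batches of 32.
examples = [f"headline {i}" for i in range(70)]
batch_sizes = [len(batch) for batch in iter_batches(examples, batch_size=32)]
print(batch_sizes)  # [32, 32, 6]
```

Each pass through the `for` loop in the usage snippet therefore handles one group of up to 32 instances rather than a single example.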
Command line
# List all available datasets
chreader list
# Show detailed information about a dataset
chreader show tnews
TODO
- Add more datasets
- Add a dataset_type field; currently classification is the only type
  - classification
    - sentiment
  - generation
    - summarization
  - tagging
    - ner
    - dependency_parsing
- Support external configuration
- Improve the command-line output formatting
- Record a demo gif
- Add docs
- Add a tutorial
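One possible way to represent the planned dataset_type field is a small mapping from top-level types to subtypes. This is only a sketch: the grouping of the type names, the `DATASET_TYPES` constant, and the `resolve_dataset_type` helper are assumptions for illustration, not the library's schema.

```python
from typing import Dict, List, Optional

# Assumed grouping of the type names from the TODO list; not the library's schema.
DATASET_TYPES: Dict[str, List[str]] = {
    "classification": ["sentiment"],
    "generation": ["summarization"],
    "tagging": ["ner", "dependency_parsing"],
}


def resolve_dataset_type(name: str) -> Optional[str]:
    """Return the top-level type for a type or subtype name, or None if unknown."""
    if name in DATASET_TYPES:
        return name
    for top_level, subtypes in DATASET_TYPES.items():
        if name in subtypes:
            return top_level
    return None


print(resolve_dataset_type("ner"))  # tagging
```

A flat string field would also work; the mapping just makes it easy to filter datasets by either a broad type or a specific subtype.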
Download files
Source Distribution
chreader-0.2.1.tar.gz (10.9 kB)
Built Distribution
chreader-0.2.1-py3-none-any.whl (12.2 kB)
File details
Details for the file chreader-0.2.1.tar.gz.
File metadata
- Download URL: chreader-0.2.1.tar.gz
- Upload date:
- Size: 10.9 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/3.2.0 pkginfo/1.6.1 requests/2.23.0 setuptools/46.4.0.post20200518 requests-toolbelt/0.9.1 tqdm/4.46.0 CPython/3.8.3
File hashes
Algorithm | Hash digest
---|---
SHA256 | 5f0854cb3f5b83dec0deefee2673ae8490535c3c985cbed099406066c2fe3015
MD5 | 8fad0976f64d26e395546c9b7f46bb7e
BLAKE2b-256 | c91219409e11cf54c1883a1cb903c401d8870aa752c1b49b81b3845fa3adcd1b
File details
Details for the file chreader-0.2.1-py3-none-any.whl.
File metadata
- Download URL: chreader-0.2.1-py3-none-any.whl
- Upload date:
- Size: 12.2 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/3.2.0 pkginfo/1.6.1 requests/2.23.0 setuptools/46.4.0.post20200518 requests-toolbelt/0.9.1 tqdm/4.46.0 CPython/3.8.3
File hashes
Algorithm | Hash digest
---|---
SHA256 | 29e9dc782d9727f83b07cca75eed86e7794fa25f49036f517d162a43f405e30b
MD5 | d3ada5a22b61cab694cd8fbde7d82732
BLAKE2b-256 | 14924dff1e8b57a729947c8eed2cc179574e8e73d8aa386af080981aac36d486