lightsmile's nlp label library
lightLabel
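A labeling system (backend) for my own use
0. Notice
Only the current version of this project's source code is planned to be open; later code will probably be closed source.
1. Introduction
This labeling system is mainly intended for simple labeling tasks. Label information is synchronized to the database in real time. Text classification tasks (with a small number of classes) are basically implemented so far.
2. Features
- The functional layers of the system are loosely coupled, and labeling tasks are abstracted as resource classes.
- Database access is also wrapped in an abstract class, with a default MongoDB implementation; in theory, implementations for other databases can be added freely.
- Flask is used as the web framework and the design follows the RESTful style fairly strictly; the system provides REST API endpoints for every labeling task.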
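The idea of hiding database access behind an abstract class can be sketched roughly as follows. This is only an illustration under stated assumptions: `LabelStore`, `InMemoryStore`, and their method names are hypothetical and are not lightLabel's actual classes. The item fields mirror the ones visible in the database dump later in this README.

```python
from abc import ABC, abstractmethod
from typing import Any, Dict, List


class LabelStore(ABC):
    """Hypothetical storage interface; NOT lightLabel's actual class."""

    @abstractmethod
    def insert_items(self, task: str, raw_items: List[Dict[str, Any]]) -> None:
        """Store unlabeled items for a task."""

    @abstractmethod
    def set_label(self, task: str, index: int, label: str) -> None:
        """Record a label for the item at the given position."""

    @abstractmethod
    def items(self, task: str) -> List[Dict[str, Any]]:
        """Return every item stored for a task."""


class InMemoryStore(LabelStore):
    """Dictionary-backed stand-in for a real database backend."""

    def __init__(self) -> None:
        self._tasks: Dict[str, List[Dict[str, Any]]] = {}

    def insert_items(self, task, raw_items):
        bucket = self._tasks.setdefault(task, [])
        for raw in raw_items:
            bucket.append(
                {"raw_data": raw, "labeled_data": None, "labeled_status": False}
            )

    def set_label(self, task, index, label):
        item = self._tasks[task][index]
        item["labeled_data"] = label
        item["labeled_status"] = True

    def items(self, task):
        return list(self._tasks.get(task, []))


store = InMemoryStore()
store.insert_items("ttt_demo", [{"word": "李白"}, {"word": "曹操"}])
store.set_label("ttt_demo", 0, "唐朝人物")
print(store.items("ttt_demo")[0])
```

Code that depends only on the abstract interface would not need to change if a different backend class were swapped in.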
3. Usage example
Example code
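from lightlabel import Engine, TextClassification
# Create a text classification task with a title and a description
text_cls = TextClassification('ttt_demo', 'des_demo')
# Register the candidate classes
text_cls.add_classes(['唐朝人物', '虚拟人物', '三国人物'])
# Load the samples to label from a CSV file whose single column is named 'word'
text_cls.update_from_csv(r'C:\Users\Alienware\Desktop\text_classification_demo.csv', headers=['word'])
# Register the task with the engine and start the web service
engine = Engine()
engine.add_plan(text_cls)
engine.run()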
Output
* Running on http://localhost:5000/ (Press CTRL+C to quit)
{'ttt_demo'}
* Serving Flask app "lightlabel.web.engine" (lazy loading)
* Environment: production
WARNING: This is a development server. Do not use it in a production deployment.
Use a production WSGI server instead.
* Debug mode: off
127.0.0.1 - - [21/Feb/2020 18:24:52] "GET /ttt_demo/items HTTP/1.1" 200 -
127.0.0.1 - - [21/Feb/2020 18:25:31] "GET /ttt_demo/data HTTP/1.1" 200 -
127.0.0.1 - - [21/Feb/2020 18:30:57] "GET /project_lists HTTP/1.1" 200 -
127.0.0.1 - - [21/Feb/2020 18:30:58] "GET /favicon.ico HTTP/1.1" 200 -
Contents of text_classification_demo.csv
李白
曹操
夏侯惇
张飞
周瑜
陆逊
司马懿
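A file like this can be produced and read back with Python's standard `csv` module. The sketch below assumes the file has one value per row, no header row, and UTF-8 encoding, and that the `headers=['word']` argument simply names the column; none of this is confirmed by lightLabel itself. The temporary directory is used only to keep the example self-contained.

```python
import csv
import os
import tempfile

words = ["李白", "曹操", "夏侯惇", "张飞", "周瑜", "陆逊", "司马懿"]
headers = ["word"]  # same column name as passed to update_from_csv

# Write the demo file: one value per row, no header row (assumed layout).
path = os.path.join(tempfile.mkdtemp(), "text_classification_demo.csv")
with open(path, "w", newline="", encoding="utf-8") as f:
    csv.writer(f).writerows([w] for w in words)

# Read it back, pairing each row with the header names.
with open(path, newline="", encoding="utf-8") as f:
    rows = [dict(zip(headers, row)) for row in csv.reader(f)]

print(rows[0])
```

Each resulting dictionary has the same shape as the `raw_data` field of the stored items.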
Data for this text classification task in the database (it differs slightly from the CSV because I labeled a few items)
/* 1 */
{
"_id" : ObjectId("5e4d411fc827b0709d554149"),
"check_status" : false,
"labeled_data" : "唐朝人物",
"labeled_status" : true,
"labeled_user" : null,
"raw_data" : {
"word" : "李白"
},
"updated_time" : null
}
/* 2 */
{
"_id" : ObjectId("5e4d411fc827b0709d55414b"),
"check_status" : false,
"labeled_data" : "三国人物",
"labeled_status" : true,
"labeled_user" : null,
"raw_data" : {
"word" : "曹操"
},
"updated_time" : null
}
/* 3 */
{
"_id" : ObjectId("5e4d411fc827b0709d55414d"),
"check_status" : false,
"labeled_data" : "三国人物",
"labeled_status" : true,
"labeled_user" : null,
"raw_data" : {
"word" : "夏侯惇"
},
"updated_time" : null
}
/* 4 */
{
"_id" : ObjectId("5e4d411fc827b0709d55414f"),
"check_status" : false,
"labeled_data" : null,
"labeled_status" : false,
"labeled_user" : null,
"raw_data" : {
"word" : "张飞"
},
"updated_time" : null
}
/* 5 */
{
"_id" : ObjectId("5e4d411fc827b0709d554151"),
"check_status" : false,
"labeled_data" : null,
"labeled_status" : false,
"labeled_user" : null,
"raw_data" : {
"word" : "周瑜"
},
"updated_time" : null
}
/* 6 */
{
"_id" : ObjectId("5e4d411fc827b0709d554153"),
"check_status" : false,
"labeled_data" : null,
"labeled_status" : false,
"labeled_user" : null,
"raw_data" : {
"word" : "陆逊"
},
"updated_time" : null
}
/* 7 */
{
"_id" : ObjectId("5e4d411fc827b0709d554155"),
"check_status" : false,
"labeled_data" : null,
"labeled_status" : false,
"labeled_user" : null,
"raw_data" : {
"word" : "司马懿"
},
"updated_time" : null
}
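As a small illustration of how the `labeled_status` and `labeled_data` fields might be used, the sketch below summarizes labeling progress. It works on plain dictionaries that copy only the relevant fields of the seven documents above; it does not query MongoDB, and the variable names are illustrative.

```python
from collections import Counter

# Reduced copies of the seven documents above (only the fields used here).
items = [
    {"raw_data": {"word": "李白"}, "labeled_data": "唐朝人物", "labeled_status": True},
    {"raw_data": {"word": "曹操"}, "labeled_data": "三国人物", "labeled_status": True},
    {"raw_data": {"word": "夏侯惇"}, "labeled_data": "三国人物", "labeled_status": True},
    {"raw_data": {"word": "张飞"}, "labeled_data": None, "labeled_status": False},
    {"raw_data": {"word": "周瑜"}, "labeled_data": None, "labeled_status": False},
    {"raw_data": {"word": "陆逊"}, "labeled_data": None, "labeled_status": False},
    {"raw_data": {"word": "司马懿"}, "labeled_data": None, "labeled_status": False},
]

# Split items by status and count labels per class.
labeled = [it for it in items if it["labeled_status"]]
pending = [it["raw_data"]["word"] for it in items if not it["labeled_status"]]
per_class = Counter(it["labeled_data"] for it in labeled)

print(f"{len(labeled)}/{len(items)} labeled")
print(dict(per_class))
print(pending)
```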
Task information for this text classification task in the database
/* 1 */
{
"_id" : ObjectId("5e4d411fc827b0709d554143"),
"description" : "des_demo",
"label_status" : true,
"task_type" : "TextClassification",
"title" : "ttt_demo",
"data" : {
"classes" : [
"唐朝人物",
"虚拟人物",
"三国人物"
]
},
"data_path" : [
"C:\\Users\\Alienware\\Desktop\\text_classification_demo.csv"
]
}
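REST endpoints
- http://localhost:5000/project_lists: returns information about all tasks
- http://localhost:5000/ttt_demo/items: all labeling items of this task
- http://localhost:5000/ttt_demo/data: data related to this task; in this example, the list of all label classes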
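The endpoints listed above could be called from a script along these lines. This is a minimal sketch using only the standard library; the helper names `task_endpoints` and `fetch_json` are hypothetical, and the assumption that the responses are JSON is not confirmed by this README.

```python
import json
from urllib.request import urlopen

BASE_URL = "http://localhost:5000"  # address shown in the run log


def task_endpoints(task: str, base_url: str = BASE_URL) -> dict:
    """Build the three URLs listed above for one task title."""
    return {
        "projects": f"{base_url}/project_lists",
        "items": f"{base_url}/{task}/items",
        "data": f"{base_url}/{task}/data",
    }


def fetch_json(url: str, timeout: float = 5.0):
    """GET a URL and decode the body as JSON (JSON responses are an assumption)."""
    with urlopen(url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


urls = task_endpoints("ttt_demo")
for name, url in urls.items():
    print(name, url)

# With the server from the usage example running, one could then try:
# print(fetch_json(urls["data"]))
```

The network call is left commented out so the snippet runs without a server.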
The screenshots below show the details:
4. References