A tool for converting raw text and slot values into slot-tagging formats
Project description
bert_slot_tokenizer
Version 0.2
bert_slot_tokenizer is a tool that converts the slots of a slot filling task into other formats.
Environment:
- Python 3
- Python 2
Installation:
pip install bert-slot-tokenizer
Supported formats:
Usage:
from bert_slot_tokenizer import SlotConverter
vocab_path = 'tests/test_data/example_vocab.txt'
# you can find an example vocab file here --> https://github.com/DevRoss/bert-slot-tokenizer/blob/master/tests/test_data/example_vocab.txt
sc = SlotConverter(vocab_path, do_lower_case=True)
text = 'Too YOUNG, too simple, sometimes naive! 蛤蛤+1s'
slot = {'name': '蛤蛤', 'time': '+1s'}
output_text, iob_slot = sc.convert2iob(text, slot)
print(output_text)
# ['too', 'young', ',', 'too', 'simple', ',', 'some', '##times', 'na', '##ive', '!', '蛤', '蛤', '+', '1', '##s']
print(iob_slot)
# ['O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'B-name', 'I-name', 'B-time', 'I-time', 'I-time']
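To make the relationship between the two printed lists concrete, here is a minimal, self-contained sketch of IOB tagging over an already tokenized sentence, plus the reverse step that rebuilds slot strings from the tags. It is not the library's implementation: `tag_iob` and `decode_iob` are hypothetical helper names, and the sketch assumes the slot values have already been split into the same word pieces as the text.

```python
from typing import Dict, List


def tag_iob(tokens: List[str], slot_tokens: Dict[str, List[str]]) -> List[str]:
    """Label each token 'O', 'B-<slot>' or 'I-<slot>' by matching slot token runs."""
    tags = ['O'] * len(tokens)
    for name, value in slot_tokens.items():
        n = len(value)
        if n == 0:
            continue
        for start in range(len(tokens) - n + 1):
            # Only tag a span that matches and is not already claimed by another slot.
            span_free = all(t == 'O' for t in tags[start:start + n])
            if tokens[start:start + n] == value and span_free:
                tags[start] = 'B-' + name
                for i in range(start + 1, start + n):
                    tags[i] = 'I-' + name
                break  # simplification: tag only the first occurrence of each slot
    return tags


def decode_iob(tokens: List[str], tags: List[str]) -> Dict[str, str]:
    """Rebuild slot strings from IOB tags, gluing '##' word pieces back on.

    Simplification: pieces are joined without spaces, which suits the example
    here but would not restore multi-word English slot values.
    """
    slots: Dict[str, str] = {}
    for token, tag in zip(tokens, tags):
        if tag == 'O':
            continue
        name = tag[2:]
        piece = token[2:] if token.startswith('##') else token
        if tag.startswith('B-'):
            slots[name] = piece  # a new span starts; overwrite any earlier one
        else:
            slots[name] = slots.get(name, '') + piece
    return slots


tokens = ['too', 'young', ',', 'too', 'simple', ',', 'some', '##times',
          'na', '##ive', '!', '蛤', '蛤', '+', '1', '##s']
slot_tokens = {'name': ['蛤', '蛤'], 'time': ['+', '1', '##s']}

tags = tag_iob(tokens, slot_tokens)
print(tags)
print(decode_iob(tokens, tags))
```

With the token list from the example above, the sketch produces the same tag sequence as `iob_slot`, and decoding recovers the original `slot` dictionary.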
Closing notes:
Thanks to BERT for advancing the field of NLP
Thanks to open source
PRs and issues are welcome
Contact: devross1997@gmail.com
Download files
Source Distribution
bert_slot_tokenizer-0.2.1.tar.gz (11.9 kB)
Built Distribution
Hashes for bert_slot_tokenizer-0.2.1.tar.gz
Algorithm | Hash digest
---|---
SHA256 | 3563c048499f730aa25a7b15deb8bad1da2d654818e083f308acb685f93cf012
MD5 | 134472fcad2be020525f7ab6064b2b93
BLAKE2b-256 | 3e0c751e707569a76949a0d8c00fbef914b0d9e673d1b20c653cbb6ce20c2e43
Hashes for bert_slot_tokenizer-0.2.1-py2.py3-none-any.whl
Algorithm | Hash digest
---|---
SHA256 | c1bba9061c6fcbdb0174f871e1f9f59a9602ed9bd1f46f27dc9f7d88e9f1ff7a
MD5 | a9eeb23828d5b51d92f6bb4691a35033
BLAKE2b-256 | ea4186bf89b1561ec0c8c93cb99f69fe0b54fe89d135ac985f004939094b9260