A tool for converting raw text into slot labels

bert_slot_tokenizer

Version 0.2

bert_slot_tokenizer is a tool that converts the slots in a slot filling task into other formats.

Environment:

  • Python 3
  • Python 2

Installation:

pip install bert-slot-tokenizer

Supported formats:

Usage:

from bert_slot_tokenizer import SlotConverter
vocab_path = 'tests/test_data/example_vocab.txt'
# You can find an example vocab file here --> https://github.com/DevRoss/bert-slot-tokenizer/blob/master/tests/test_data/example_vocab.txt
sc = SlotConverter(vocab_path, do_lower_case=True)
text = 'Too YOUNG, too simple, sometimes naive! 蛤蛤+1s'
slot = {'name': '蛤蛤', 'time': '+1s'}
output_text, iob_slot = sc.convert2iob(text, slot)
print(output_text)
# ['too', 'young', ',', 'too', 'simple', ',', 'some', '##times', 'na', '##ive', '!', '蛤', '蛤', '+', '1', '##s']
print(iob_slot)
# ['O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'B-name', 'I-name', 'B-time', 'I-time', 'I-time']
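
To read the slots back out of the output above, the tokens can be grouped by their IOB labels. The following is a minimal sketch and not part of bert_slot_tokenizer: `iob_to_slots` is a hypothetical helper, and it assumes that WordPiece continuation pieces start with `##` and that the tokens of a slot can be joined without spaces (true for the example here, but not for multi-word English slots).

```python
def iob_to_slots(tokens, labels):
    """Rebuild {slot_name: value} from tokens and their IOB labels.

    Hypothetical helper for illustration; not provided by bert_slot_tokenizer.
    """
    slots = {}
    current = None  # name of the slot currently being collected
    for token, label in zip(tokens, labels):
        if label.startswith('B-'):
            # 'B-' opens a new slot
            current = label[2:]
            slots[current] = token
        elif label.startswith('I-') and current == label[2:]:
            # 'I-' continues the open slot; drop the WordPiece '##' prefix
            piece = token[2:] if token.startswith('##') else token
            slots[current] += piece
        else:
            # 'O' (or a stray 'I-' label) closes any open slot
            current = None
    return slots


# Tokens and labels copied from the usage example above
tokens = ['too', 'young', ',', 'too', 'simple', ',', 'some', '##times',
          'na', '##ive', '!', '蛤', '蛤', '+', '1', '##s']
labels = ['O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O',
          'B-name', 'I-name', 'B-time', 'I-time', 'I-time']

print(iob_to_slots(tokens, labels))
# {'name': '蛤蛤', 'time': '+1s'}
```

Because the tokenizer lowercases the text when `do_lower_case=True`, values recovered this way are lowercased as well; restoring the original casing or spacing would require character offsets, which this sketch does not track.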

Closing notes:

Thanks to BERT for advancing the field of NLP.

Thanks to open source.

PRs and issues are welcome.

Contact: devross1997@gmail.com

Download files

Files for bert-slot-tokenizer, version 0.2.1:

  • bert_slot_tokenizer-0.2.1-py2.py3-none-any.whl (13.1 kB), wheel, Python version py2.py3
  • bert_slot_tokenizer-0.2.1.tar.gz (11.9 kB), source distribution
