An open toolkit of universal extraction from text.

OpenUE PyTorch

An OpenUE-based extraction model can be trained and deployed in the following steps:

  1. Download the SKE dataset
mkdir dataset
wget http://47.92.96.190/dataset/ske.tar.gz
tar zxvf ske.tar.gz -C ske
The archive contains four files: all_50_schemas, train.json, test.json, and dev.json.
  2. Preprocess the data
Download a pretrained language model (e.g., [bert-base-chinese](https://github.com/google-research/bert)) and place it in the corresponding folder.
  3. Train the classification model
python run_seq.py --model_name_or_path $pretrained_model_path/bert-base-chinese --data_dir ./dataset --output_dir ./output_seq --save_steps 5000 --num_train_epochs 30 --per_device_train_batch_size 16 --per_device_eval_batch_size 16 --do_train --task seq
  4. Train the sequence labeling model
python run_ner.py --model_name_or_path $pretrained_model_path/bert-base-chinese --data_dir ./dataset --output_dir ./output_ner --save_steps 2000 --num_train_epochs 20 --per_device_train_batch_size 16 --per_device_eval_batch_size 32 --do_train --task ner
  5. Integrate the models and test
python Interactive.py --seq_model_path ./output_seq/ --ner_model_path ./output_ner/ --output_dir ./output/ --data_dir ./dataset --per_device_eval_batch_size 1 --do_predict --task interactive
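
The three script invocations above can be chained from a single driver. The following is a minimal sketch, not part of OpenUE: the helper names `missing_dataset_files`, `build_commands`, and `run_pipeline` are hypothetical, and it assumes the four dataset files sit directly inside the directory passed as `data_dir` and that the pretrained model lives in a `bert-base-chinese` subfolder of the given path. The flags are copied from the commands above.

```python
import shlex
import subprocess
from pathlib import Path

# File names taken from the dataset step; their location inside data_dir
# is an assumption of this sketch.
EXPECTED_FILES = ("all_50_schemas", "train.json", "test.json", "dev.json")


def missing_dataset_files(data_dir):
    """Return the expected SKE files that are not present in data_dir."""
    root = Path(data_dir)
    return [name for name in EXPECTED_FILES if not (root / name).is_file()]


def build_commands(pretrained_model_path, data_dir="./dataset"):
    """Build the classification, sequence-labeling, and interactive
    commands as argument lists, mirroring the flags shown above."""
    model = f"{pretrained_model_path}/bert-base-chinese"
    seq = [
        "python", "run_seq.py",
        "--model_name_or_path", model,
        "--data_dir", data_dir,
        "--output_dir", "./output_seq",
        "--save_steps", "5000",
        "--num_train_epochs", "30",
        "--per_device_train_batch_size", "16",
        "--per_device_eval_batch_size", "16",
        "--do_train", "--task", "seq",
    ]
    ner = [
        "python", "run_ner.py",
        "--model_name_or_path", model,
        "--data_dir", data_dir,
        "--output_dir", "./output_ner",
        "--save_steps", "2000",
        "--num_train_epochs", "20",
        "--per_device_train_batch_size", "16",
        "--per_device_eval_batch_size", "32",
        "--do_train", "--task", "ner",
    ]
    interactive = [
        "python", "Interactive.py",
        "--seq_model_path", "./output_seq/",
        "--ner_model_path", "./output_ner/",
        "--output_dir", "./output/",
        "--data_dir", data_dir,
        "--per_device_eval_batch_size", "1",
        "--do_predict", "--task", "interactive",
    ]
    return [seq, ner, interactive]


def run_pipeline(pretrained_model_path, data_dir="./dataset", dry_run=True):
    """Check the dataset, then print (dry run) or execute each stage in order."""
    missing = missing_dataset_files(data_dir)
    if missing:
        raise FileNotFoundError(f"missing dataset files in {data_dir}: {missing}")
    for cmd in build_commands(pretrained_model_path, data_dir):
        if dry_run:
            print(shlex.join(cmd))
        else:
            # check=True stops the pipeline as soon as one stage fails.
            subprocess.run(cmd, check=True)
```

With the default `dry_run=True`, `run_pipeline("/path/to/models")` only prints the three commands, which makes it easy to inspect them first; passing `dry_run=False` runs them in sequence from the directory that holds the scripts.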

Citation

If you use or extend our work, please cite the following paper:

@inproceedings{zhang-2020-opennue,
    title = "{O}pen{UE}: An Open Toolkit of Universal Extraction from Text",
    author = "Ningyu Zhang and Shumin Deng and Zhen Bi and Haiyang Yu and Jiacheng Yang and Mosha Chen and Fei Huang and Wei Zhang and Huajun Chen",
    year = "2020",
}


