Skip to main content

Syntax Error Data Enhancement

Project description

一键语法错误增强工具

欢迎使用一键语法错误增强工具,该工具可以进行14种语法错误的增强,不同行业可以根据自己的数据进行错误替换,来训练自己的语法和拼写模型。

使用:pip install ChineseErrorCorrector

开源不易,欢迎 star🌟

pypi:https://pypi.org/project/ChineseErrorCorrector/


介绍

一键语法错误增强工具,支持:

注意

如果没有进行数据增强,则返回None


API

1.缺字漏字

from ChineseErrorCorrector.dat import GrammarErrorDat

cged_tool = GrammarErrorDat()
print(cged_tool.lack_word("小明住在北京"))

# 输出:小明在北京

2.错别字错误

from ChineseErrorCorrector.dat import GrammarErrorDat

cged_tool = GrammarErrorDat()
print(cged_tool.wrong_word("小明住在北京"))
# 输出:小明住在北鲸

3.缺少标点

from ChineseErrorCorrector.dat import GrammarErrorDat

cged_tool = GrammarErrorDat()
print(cged_tool.lack_char("小明住在北京,热爱NLP。"))
# 输出:小明住在北京热爱NLP。

4.错用标点

from ChineseErrorCorrector.dat import GrammarErrorDat

cged_tool = GrammarErrorDat()
print(cged_tool.wrong_char("小明住在北京,热爱NLP。"))
# 输出:小明住在北京。热爱NLP。

5.主语不明

from ChineseErrorCorrector.dat import GrammarErrorDat

cged_tool = GrammarErrorDat()
print(cged_tool.unknow_sub("小明住在北京"))
# 输出:住在北京

6.谓语残缺

from ChineseErrorCorrector.dat import GrammarErrorDat

cged_tool = GrammarErrorDat()
print(cged_tool.unknow_pred("小明住在北京"))
# 输出:小明在北京

7.宾语残缺

from ChineseErrorCorrector.dat import GrammarErrorDat

cged_tool = GrammarErrorDat()
print(cged_tool.lack_obj("小明住在北京,热爱NLP。"))
# 输出:小明住在北京,热爱。

8.其他成分残缺

from ChineseErrorCorrector.dat import GrammarErrorDat

cged_tool = GrammarErrorDat()
print(cged_tool.lack_others("小明住在北京,热爱NLP。"))
# 输出:小明住北京,热爱NLP。

9.虚词多余

from ChineseErrorCorrector.dat import GrammarErrorDat

cged_tool = GrammarErrorDat()
print(cged_tool.red_fun("小明住在北京,热爱NLP。"))
# 输出:小明所住的在北京,热爱NLP。

10.其他成分多余

from ChineseErrorCorrector.dat import GrammarErrorDat

cged_tool = GrammarErrorDat()
print(cged_tool.red_component("小明住在北京,热爱NLP。"))
# 输出:小明住在北京,热爱NLP。,看着

11.主语多余

from ChineseErrorCorrector.dat import GrammarErrorDat

cged_tool = GrammarErrorDat()
print(cged_tool.red_sub("小明住在北京,热爱NLP。"))
# 输出:小明住在北京,小明热爱NLP。

12.语序不当

from ChineseErrorCorrector.dat import GrammarErrorDat

cged_tool = GrammarErrorDat()
print(cged_tool.wrong_sentence_order("小明住在北京,热爱NLP。"))
# 输出:热爱NLP。,小明住在北京

13.动宾搭配不当

from ChineseErrorCorrector.dat import GrammarErrorDat

cged_tool = GrammarErrorDat()
print(cged_tool.wrong_ver_obj("小明住在北京,热爱NLP。"))
# 输出:None ,即无法进行此类错误的增强

14.其他搭配不当

from ChineseErrorCorrector.dat import GrammarErrorDat

cged_tool = GrammarErrorDat()
print(cged_tool.other_wrong("小明住在北京,热爱NLP。"))
# 输出:None, 即无法进行此类错误的增强

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

chineseerrorcorrector-1.6.0.tar.gz (34.4 MB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

ChineseErrorCorrector-1.6.0-py3-none-any.whl (34.4 MB view details)

Uploaded Python 3

File details

Details for the file chineseerrorcorrector-1.6.0.tar.gz.

File metadata

  • Download URL: chineseerrorcorrector-1.6.0.tar.gz
  • Upload date:
  • Size: 34.4 MB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.1.1 CPython/3.8.19

File hashes

Hashes for chineseerrorcorrector-1.6.0.tar.gz
Algorithm Hash digest
SHA256 f50da65ad753b5a05ee3f06d32daec0c80e2511820223425da5a622096361c9e
MD5 3e6614902dd4e0644f6858b139ecfb4f
BLAKE2b-256 673fa9003aecf1abd3d3da68102b5ec25049af7246744f409873d2cc6e5679f2

See more details on using hashes here.

File details

Details for the file ChineseErrorCorrector-1.6.0-py3-none-any.whl.

File metadata

File hashes

Hashes for ChineseErrorCorrector-1.6.0-py3-none-any.whl
Algorithm Hash digest
SHA256 52b1c309cf5a498f018f4ea655e6c4fec178ed12306d1a3092f85e7043ee1dbc
MD5 7cbb737bc73fccf95c729e809c1e76e1
BLAKE2b-256 ebc93f6ef71dac5449920518ba2fd6af5e18cc7622a629a04597e40ea5fed73f

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page