
Syntax Error Data Enhancement

Project description

Grammatical Error Data Augmentation

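Grammatical-error data is scarce in many industries. To address this, the project proposes a grammatical-error substitution method that generates errors from a domain's own data, so that industry-specific models can be customized. It currently supports substitution for 14 fine-grained error types: missing characters, wrong characters (typos), missing punctuation, misused punctuation, unclear subject, incomplete predicate, incomplete object, other incomplete components, redundant subject, redundant function words, other redundant components, improper word order, improper verb-object collocation, and other improper collocations. The figure below shows an example: image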

Several kinds of errors can also be substituted into a single sentence, as follows:

image
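To make the idea of error substitution concrete, here is a minimal sketch in plain Python covering three of the character-level types listed above (missing characters, missing punctuation, misused punctuation) and how several of them could be chained on one sentence. It is not the GrammarEnhancer API: the function names, the `PUNCTUATION` set, and the `inject_errors` helper are all hypothetical and chosen only for illustration. The syntax-level types (for example, incomplete predicate or improper word order) would additionally need a parser such as LTP and are not covered here.

```python
import random

# Assumption: a small illustrative set of full-width Chinese punctuation marks.
PUNCTUATION = ",。!?;:、"


def _delete_at(sentence, index):
    """Return the sentence with the character at `index` removed."""
    return sentence[:index] + sentence[index + 1:]


def drop_character(sentence, rng):
    """Missing-character error: delete one non-punctuation character."""
    positions = [i for i, ch in enumerate(sentence) if ch not in PUNCTUATION]
    if not positions:
        return sentence
    return _delete_at(sentence, rng.choice(positions))


def drop_punctuation(sentence, rng):
    """Missing-punctuation error: delete one punctuation mark."""
    positions = [i for i, ch in enumerate(sentence) if ch in PUNCTUATION]
    if not positions:
        return sentence
    return _delete_at(sentence, rng.choice(positions))


def swap_punctuation(sentence, rng):
    """Misused-punctuation error: replace one mark with a different mark."""
    positions = [i for i, ch in enumerate(sentence) if ch in PUNCTUATION]
    if not positions:
        return sentence
    i = rng.choice(positions)
    alternatives = [p for p in PUNCTUATION if p != sentence[i]]
    return sentence[:i] + rng.choice(alternatives) + sentence[i + 1:]


def inject_errors(sentence, operations, seed=0):
    """Apply several (label, function) error operations in order.

    Returns the corrupted sentence and the labels of the operations that
    actually changed it, which can serve as training labels.
    """
    rng = random.Random(seed)  # seeded so the corruption is reproducible
    corrupted, applied = sentence, []
    for label, operation in operations:
        result = operation(corrupted, rng)
        if result != corrupted:
            applied.append(label)
        corrupted = result
    return corrupted, applied


if __name__ == "__main__":
    source = "今天天气很好,我们去公园散步。"
    operations = [
        ("missing_character", drop_character),
        ("misused_punctuation", swap_punctuation),
    ]
    print(inject_errors(source, operations, seed=0))
```

Returning the applied labels next to the corrupted sentence keeps each (correct sentence, erroneous sentence, error types) triple together, which is the shape a fine-grained error detection model would typically train on.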

Model file download

The ltp_small model under pre_model can be downloaded from: https://huggingface.co/LTP/small

Won first prize in 2024CCL Task7

2024CCL Task7: https://github.com/cubenlp/2024CCL_CEFE

Blog post sharing our experience: https://www.cnblogs.com/twnlp/p/18208637

Evaluation paper: to be published


This version

1.1

Download files


Source Distribution

grammarenhancer-1.1.tar.gz (7.6 kB)

Uploaded Source

Built Distribution

GrammarEnhancer-1.1-py3-none-any.whl (7.7 kB)

Uploaded Python 3
