Chinese Mistake-Type Text Augmentation
Generates Chinese text corpora that contain specific types of mistakes.
Installation
pip install zh-mistake-text-gen
Usage (Pipeline)
from zh_mistake_text_gen import Pipeline
pipeline = Pipeline()
incorrect_sent = pipeline("中文語料生成")
print(incorrect_sent)
# type='PronounceSimilarVocabMaker' correct='中文語料生成' incorrect='鍾文語料生成' incorrect_start_at=0 incorrect_end_at=2 span='鍾文'
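The printed result carries the original sentence, the corrupted sentence, and the character offsets of the change. The sketch below mirrors those fields in a stand-in dataclass to show how the offsets can be read. `MistakeResult` and `changed_span` are hypothetical names, not the library's own types, and the sketch assumes `incorrect_end_at` is an exclusive index into `incorrect`, which is consistent with the sample output above.

```python
from dataclasses import dataclass


@dataclass
class MistakeResult:
    """Hypothetical stand-in mirroring the fields printed in the sample output."""
    type: str
    correct: str
    incorrect: str
    incorrect_start_at: int
    incorrect_end_at: int
    span: str


def changed_span(result: MistakeResult) -> str:
    """Slice the corrupted sentence with the reported offsets (end assumed exclusive)."""
    return result.incorrect[result.incorrect_start_at:result.incorrect_end_at]


# Values copied from the sample output above.
sample = MistakeResult(
    type="PronounceSimilarVocabMaker",
    correct="中文語料生成",
    incorrect="鍾文語料生成",
    incorrect_start_at=0,
    incorrect_end_at=2,
    span="鍾文",
)

print(changed_span(sample))  # 鍾文
```

Under that assumption the sliced text equals the `span` field, which makes the offsets convenient for building character-level labels.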
Documentation
Pipeline
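- __init__
  - makers=None : maker instances, optional
  - maker_weight=None : probability of each maker being selected, optional
- __call__
  - x : input sentence (str), required
  - error_per_sent : number of mistakes to generate per sentence. Default: 1
  - no_change_on_gen_fail : whether the sentence may be left unchanged when a generation method fails. When enabled, no error is raised; when disabled, an error is raised. Default: False
  - verbose=True : debug messages, optional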
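To make the interaction of these parameters concrete, here is a minimal self-contained sketch of a weighted pipeline. It is not the library's implementation: `toy_pipeline`, `drop_first_char`, `always_fail`, and the `rng` argument are hypothetical, and makers are simplified to plain callables that raise `ValueError` on failure. Only the parameter names `makers`, `maker_weight`, `error_per_sent`, and `no_change_on_gen_fail` come from the documentation above.

```python
import random
from typing import Callable, Optional, Sequence

# Simplification for illustration: a "maker" is any callable that returns a
# corrupted sentence or raises ValueError when it cannot produce one.
Maker = Callable[[str], str]


def drop_first_char(sent: str) -> str:
    """Toy maker: remove the first character."""
    if len(sent) < 2:
        raise ValueError("sentence too short")
    return sent[1:]


def always_fail(sent: str) -> str:
    """Toy maker that never succeeds, used to show failure handling."""
    raise ValueError("cannot generate")


def toy_pipeline(
    x: str,
    makers: Sequence[Maker],
    maker_weight: Optional[Sequence[float]] = None,
    error_per_sent: int = 1,
    no_change_on_gen_fail: bool = False,
    rng: Optional[random.Random] = None,
) -> str:
    rng = rng or random.Random()
    out = x
    for _ in range(error_per_sent):
        # Draw one maker; weights=None means a uniform draw.
        maker = rng.choices(list(makers), weights=maker_weight, k=1)[0]
        try:
            out = maker(out)
        except ValueError:
            if not no_change_on_gen_fail:
                raise
            # Otherwise leave the sentence as it is and continue.
    return out


print(toy_pipeline("中文語料生成", [drop_first_char, always_fail],
                   maker_weight=[1.0, 0.0]))  # 文語料生成
```

With the weights reversed to `[0.0, 1.0]`, the failing maker is always drawn: the call raises `ValueError` by default and returns the input unchanged when `no_change_on_gen_fail=True`.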
Available makers
from zh_mistake_text_gen.data_maker import *
| Data Maker | Description |
|---|---|
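| NoChangeMaker | Applies no change |
| MissingWordMaker | Randomly deletes a character |
| MissingVocabMaker | Randomly deletes a word |
| PronounceSimilarWordMaker | Randomly replaces a character with a similar-sounding one |
| PronounceSimilarWordPlusMaker | Finds similar-sounding characters by edit distance and replaces with a high-frequency character |
| PronounceSimilarVocabMaker | Replaces a word with a similar-sounding word |
| PronounceSimilarVocabPlusMaker | Finds similar-sounding words by edit distance and replaces the word |
| PronounceSameWordMaker | Replaces a character with a same-sounding character |
| PronounceSameVocabMaker | Replaces a word with a same-sounding word |
| RedundantWordMaker | Randomly copies an adjacent character as a redundant character |
| RandomInsertVacabMaker | Randomly inserts a word |
| MistakWordMaker | Randomly replaces a character |
| MistakeWordHighFreqMaker | Randomly replaces high-frequency characters |
| MissingWordHighFreqMaker | Randomly deletes high-frequency characters |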
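As a rough illustration of the kind of transformation two of these makers describe, the sketch below implements random character deletion and redundant-character insertion as plain functions. These are toy versions written for this example, not the library's maker classes; `missing_char` and `redundant_char` are hypothetical names, and duplicating a character next to itself is only one possible reading of "copies an adjacent character".

```python
import random


def missing_char(sent: str, rng: random.Random) -> str:
    """Delete one randomly chosen character (toy analogue of a missing-character mistake)."""
    if len(sent) < 2:
        raise ValueError("need at least two characters")
    i = rng.randrange(len(sent))
    return sent[:i] + sent[i + 1:]


def redundant_char(sent: str, rng: random.Random) -> str:
    """Duplicate one randomly chosen character next to itself (toy redundant-character mistake)."""
    if not sent:
        raise ValueError("empty sentence")
    i = rng.randrange(len(sent))
    return sent[:i + 1] + sent[i] + sent[i + 1:]


rng = random.Random(0)  # seeded so repeated runs give the same result
print(missing_char("中文語料生成", rng))
print(redundant_char("中文語料生成", rng))
```

Passing an explicit `random.Random` keeps the functions deterministic under a fixed seed, which is useful when generated corpora need to be reproducible.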