rust-participle
Nazrin
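A Python binding for jieba-rs, a Chinese word segmentation tool
Faster than the pure-Python jieba implementation. It releases the GIL while segmenting, so it can be used for multithreaded processing.
Installation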
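Because the GIL is released during segmentation, several texts can be segmented on a thread pool. The following is a minimal sketch, not part of the nazrin API: `cut_many` is a hypothetical helper, and it assumes that a single `Nazrin` instance can safely be shared between threads, which the description above implies but does not state outright. The import is guarded so the helper still works when nazrin is not installed.

```python
from concurrent.futures import ThreadPoolExecutor

try:
    from nazrin import Nazrin
except ImportError:  # nazrin is not installed; the helper below still works
    Nazrin = None


def cut_many(cut, texts, workers=4):
    """Apply ``cut`` to every text on a thread pool, preserving input order.

    ``cut`` is any callable taking one string (hypothetical helper,
    not provided by nazrin itself).
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(cut, texts))


if Nazrin is not None:
    # Assumption: one Nazrin instance may be shared across threads.
    nazrin = Nazrin()
    texts = ["能找到想找的东西程度的能力"] * 8
    for words in cut_many(nazrin.cut, texts):
        print(words)
```

With a pure-Python segmenter the threads would mostly wait on each other for the GIL, so any speed-up here depends on the native code running outside it.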
pip install nazrin
Usage
from nazrin import Nazrin
nazrin = Nazrin()
print(nazrin.cut('能找到想找的东西程度的能力'))
# ['能', '找到', '想', '找', '的', '东西', '程度', '的', '能力']
print(nazrin.tag('能找到想找的东西程度的能力'))
# [('能', 'v'), ('找到', 'v'), ('想', 'v'), ('找', 'v'), ('的', 'uj'), ('东西', 'ns'), ('程度', 'n'), ('的', 'uj'), ('能力', 'n')]
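Besides `cut` and `tag`, the class also exposes `cut_all` and `cut_for_search` (see the method listing). The sketch below runs the same sentence through all three segmentation modes so the results can be compared side by side. `compare_modes` and the `Segmenter` protocol are hypothetical names introduced for illustration; only the three method names and their parameters come from the method listing. No expected output is shown because it depends on the bundled dictionary.

```python
from __future__ import annotations

from typing import Protocol

try:
    from nazrin import Nazrin
except ImportError:  # nazrin is not installed; compare_modes still works
    Nazrin = None


class Segmenter(Protocol):
    """The subset of the Nazrin interface used below."""

    def cut(self, text: str, hmm: bool = True) -> list[str]: ...
    def cut_all(self, text: str) -> list[str]: ...
    def cut_for_search(self, text: str, hmm: bool = True) -> list[str]: ...


def compare_modes(
    segmenter: Segmenter, text: str, hmm: bool = True
) -> dict[str, list[str]]:
    """Segment ``text`` in precise, full, and search engine mode."""
    return {
        "precise": segmenter.cut(text, hmm),
        "full": segmenter.cut_all(text),
        "search": segmenter.cut_for_search(text, hmm),
    }


if Nazrin is not None:
    nazrin = Nazrin()
    for mode, words in compare_modes(nazrin, "能找到想找的东西程度的能力").items():
        print(mode, words)
```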
All methods
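from typing import Literal


class Nazrin:
    def __init__(self) -> None: ...

    def add_word(
        self, word: str, freq: int | None = None, tag: str | None = None
    ) -> int:
        """
        Description:
            Add a word to the dictionary.
        Parameters:
            * ``freq``: word frequency; defaults to a computed value
            * ``tag``: part-of-speech tag; defaults to None
        """
        ...

    def load_userdict(self, path: str) -> None:
        """
        Description:
            Load a user dictionary.
        Parameters:
            * ``path``: path to the dictionary file
        """
        ...

    def suggest_freq(self, word: str) -> None:
        """
        Description:
            Suggest a word frequency that forces the characters in a word
            to be joined or split.
        Parameters:
            * ``word``: the word
        """
        ...

    def cut(self, text: str, hmm: bool = True) -> list[str]:
        """
        Description:
            Split a whole sentence containing Chinese characters into
            separate words (precise mode).
        Parameters:
            * ``text``: the text
            * ``hmm``: whether to use the hidden Markov model. Defaults to True.
        """
        ...

    def cut_all(self, text: str) -> list[str]:
        """
        Description:
            Split a whole sentence containing Chinese characters into
            separate words (full mode).
        Parameters:
            * ``text``: the text
        """
        ...

    def cut_for_search(self, text: str, hmm: bool = True) -> list[str]:
        """
        Description:
            Split a whole sentence containing Chinese characters into
            separate words (search engine mode).
        Parameters:
            * ``text``: the text
            * ``hmm``: whether to use the hidden Markov model. Defaults to True.
        """
        ...

    def tag(self, text: str, hmm: bool = True) -> list[tuple[str, str]]:
        """
        Description:
            Tag the text with part-of-speech labels.
        Parameters:
            * ``text``: the text
            * ``hmm``: whether to use the hidden Markov model. Defaults to True.
        """
        ...

    def tokenize(
        self,
        text: str,
        mode: Literal["search", "default"] = "default",
        hmm: bool = True,
    ) -> list[str]:
        """
        Description:
            Tokenize the text.
        Parameters:
            * ``text``: the text
            * ``mode``: the mode. Defaults to "default".
            * ``hmm``: whether to use the hidden Markov model. Defaults to True.
        """
        ...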
Performance comparison
In [1]: import jieba
In [2]: jieba.initialize()
Building prefix dict from the default dictionary ...
Loading model from cache jieba.cache
Loading model cost 0.647 seconds.
Prefix dict has been built successfully.
In [3]: from nazrin import Nazrin
In [4]: nazrin = Nazrin()
In [5]: with open("./docs/performance-test.txt", "r", encoding="utf-8") as f:
...: data = f.read()
...:
In [6]: %timeit list(jieba.cut(data))
3.77 ms ± 109 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
In [7]: %timeit nazrin.cut(data)
283 µs ± 14.5 µs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)
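In the session above, `nazrin.cut` takes about 283 µs per call against 3.77 ms for `jieba.cut`, roughly 13 times faster on that sample file. The comparison can also be run outside IPython with the standard `timeit` module. The script below is a sketch under assumptions: `time_call` and `main` are hypothetical helpers, the reported spread is the sample standard deviation (which may differ slightly from what `%timeit` prints), and the numbers will vary by machine. It exits with a message if jieba, nazrin, or the sample file is missing.

```python
import os
import statistics
import timeit


def time_call(func, arg, number=100, repeat=7):
    """Return (mean, stdev) of the per-call time in seconds over ``repeat`` runs."""
    totals = timeit.repeat(lambda: func(arg), repeat=repeat, number=number)
    per_call = [total / number for total in totals]
    return statistics.mean(per_call), statistics.stdev(per_call)


def main(path="./docs/performance-test.txt"):
    try:
        import jieba
        from nazrin import Nazrin
    except ImportError:
        print("jieba and nazrin must both be installed to run the comparison")
        return
    if not os.path.exists(path):
        print(f"sample file not found: {path}")
        return

    with open(path, "r", encoding="utf-8") as f:
        data = f.read()

    jieba.initialize()  # build the prefix dict before timing starts
    nazrin = Nazrin()
    candidates = [
        ("jieba", lambda s: list(jieba.cut(s))),  # jieba.cut is a generator
        ("nazrin", nazrin.cut),
    ]
    for name, func in candidates:
        mean, stdev = time_call(func, data)
        print(f"{name}: {mean * 1e6:.1f} µs ± {stdev * 1e6:.1f} µs per call")


if __name__ == "__main__":
    main()
```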
Acknowledgements
naidesu