This project is an Aho-Corasick automaton implementation in Python.
Project description
ahocorasick-python
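An Aho-Corasick (AC) automaton implemented in Python. It works with Python 2, Python 3, and other mainstream Python distributions, and it refines and optimizes the standard AC automaton algorithm (mainly by improving the accuracy of the results).
Note: to ensure accurate results, install and use the latest version (0.0.9).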
1. Installation
Install with pip (recommended)
pip install ahocorasick-python
Install from source
git clone https://github.com/xizhicode/ahocorasick-python.git
cd ahocorasick-python && python setup.py install
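2. Usage
Note: the examples below use Python 3; Python 2 gives similar results.
Simple search
import ahocorasick  # import the package
tree = ahocorasick.AhoCorasick("test", "book", "oo", "ok", "k")  # build the AC automaton
print(tree.search("test book"))  # search the text
Output: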
{'test', 'k', 'oo', 'book', 'ok'}
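To show what an AC automaton does internally, here is a minimal pure-Python sketch of the textbook Aho-Corasick search. It is not this package's source code: the names build_automaton and find_keywords are hypothetical, and the sketch does not include the accuracy refinements mentioned in the description.

```python
from collections import deque


def build_automaton(words):
    """Build goto, fail, and output tables for the given keywords."""
    goto = [{}]        # goto[state][char] -> next state
    fail = [0]         # fail[state] -> state for the longest proper suffix
    output = [set()]   # output[state] -> keywords that end at this state

    # Phase 1: insert every keyword into a trie.
    for word in words:
        state = 0
        for ch in word:
            if ch not in goto[state]:
                goto.append({})
                fail.append(0)
                output.append(set())
                goto[state][ch] = len(goto) - 1
            state = goto[state][ch]
        output[state].add(word)

    # Phase 2: breadth-first pass that fills in the failure links.
    queue = deque(goto[0].values())
    while queue:
        state = queue.popleft()
        for ch, nxt in goto[state].items():
            queue.append(nxt)
            fallback = fail[state]
            while fallback and ch not in goto[fallback]:
                fallback = fail[fallback]
            fail[nxt] = goto[fallback].get(ch, 0)
            # Inherit keywords that end at the fallback state.
            output[nxt] |= output[fail[nxt]]
    return goto, fail, output


def find_keywords(words, text):
    """Return the set of keywords that occur anywhere in text."""
    goto, fail, output = build_automaton(words)
    found = set()
    state = 0
    for ch in text:
        # Follow failure links until a transition on ch exists.
        while state and ch not in goto[state]:
            state = fail[state]
        state = goto[state].get(ch, 0)
        found |= output[state]
    return found


print(sorted(find_keywords(["test", "book", "oo", "ok", "k"], "test book")))
# ['book', 'k', 'ok', 'oo', 'test']
```

The failure links are what let overlapping keywords such as "oo", "ok", and "k" be reported while "book" is being matched, in a single pass over the text.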
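Search and return the positions of the matched strings (useful for scenarios such as string replacement)
import ahocorasick  # import the package
tree = ahocorasick.AhoCorasick("test", "book", "oo", "ok", "k")  # build the AC automaton
print(tree.search("test book", True))  # search and include positions
Output: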
{('k', (8, 9)), ('book', (5, 9)), ('oo', (6, 8)), ('ok', (7, 9)), ('test', (0, 4))}
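As an illustration of the replacement scenario, the sketch below masks every matched span in the text. It assumes, judging only from the example output above, that each position is a half-open (start, end) pair. The helper mask_matches is hypothetical and not part of this package, and the matches are copied from the output above, so the snippet runs without the package installed.

```python
def mask_matches(text, matches, mask_char="*"):
    """Replace every matched span in text with mask_char."""
    chars = list(text)
    for _word, (start, end) in matches:
        # Assumption: start is inclusive and end is exclusive.
        for i in range(start, end):
            chars[i] = mask_char
    return "".join(chars)


# Result copied from the example output for "test book".
matches = {
    ('k', (8, 9)), ('book', (5, 9)), ('oo', (6, 8)),
    ('ok', (7, 9)), ('test', (0, 4)),
}
print(mask_matches("test book", matches))  # **** ****
```

Working on a list of characters keeps overlapping spans such as "book" and "ok" from interfering with each other, since each position is simply overwritten.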
3. References
4. Contact
QQ: 943489924
Email: zhoukunpeng504@163.com
5. Note
If you run into encoding problems on Windows, deleting all of the Chinese text resolves them.
Download files
Source Distribution
File details
Details for the file ahocorasick-python-0.0.9.tar.gz.
File metadata
- Download URL: ahocorasick-python-0.0.9.tar.gz
- Upload date:
- Size: 3.3 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/3.1.1 pkginfo/1.5.0.1 requests/2.22.0 setuptools/53.0.0 requests-toolbelt/0.9.1 tqdm/4.41.1 CPython/3.7.0
File hashes
Algorithm | Hash digest
---|---
SHA256 | 8b2d5486e8c570c82a4659ca26b30d007c8919396af391b037bdd5085df51ddb
MD5 | 855ffc9cca4d91774ddeb5af747de064
BLAKE2b-256 | 01489dc62a361a5f15d378b5ed3de12f3cfcce5a17f0507646a55d07a93bc0cf