Nationwide five-level address lookup
PyUnit-Address
Address lookup in strings, with support for custom address dictionaries
Installation
pip install pyunit-address
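Notes
The algorithm uses two dictionaries. The first is the nationwide five-level address dictionary, compiled from 2018 data. It is loaded by default and can be neither removed nor replaced.
The second is a dictionary of short address names. It can be replaced, removed, or appended to; by default it contains province-level and city-level names nationwide.
To extract irregular addresses, use the deep-learning model instead: https://github.com/PyUnit/pyunit-ner
Using the two together is recommended, since each covers the other's gaps.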
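The two-dictionary idea described above can be illustrated with a small, self-contained sketch. It is not the pyunit-address implementation: the class `ToyAddressMatcher`, its methods `add_aliases`, `set_aliases`, and `find`, and the tiny word lists are all hypothetical, and forward maximum matching is only assumed here as one plausible way to do dictionary lookup. It shows a fixed dictionary that cannot be changed next to an alias dictionary that can be appended to or replaced.

```python
from typing import Iterable, List


class ToyAddressMatcher:
    """Toy dictionary matcher for illustration; NOT the pyunit-address code."""

    def __init__(self, fixed_words: Iterable[str], alias_words: Iterable[str]):
        # Stands in for the built-in five-level dictionary (never modified).
        self._fixed = frozenset(fixed_words)
        # Stands in for the replaceable short-name dictionary.
        self._aliases = set(alias_words)

    def add_aliases(self, words: Iterable[str]) -> None:
        """Append entries to the alias dictionary."""
        self._aliases.update(words)

    def set_aliases(self, words: Iterable[str]) -> None:
        """Replace the alias dictionary entirely."""
        self._aliases = set(words)

    def find(self, text: str) -> List[str]:
        """Scan left to right, taking the longest dictionary word at each position."""
        vocab = self._fixed | self._aliases
        if not vocab:
            return []
        longest = max(len(word) for word in vocab)
        found: List[str] = []
        i = 0
        while i < len(text):
            match = None
            # Try the longest candidate first, then shrink the window.
            for size in range(min(longest, len(text) - i), 0, -1):
                candidate = text[i:i + size]
                if candidate in vocab:
                    match = candidate
                    break
            if match is None:
                i += 1  # no dictionary word starts here
            else:
                found.append(match)
                i += len(match)
        return found


if __name__ == "__main__":
    matcher = ToyAddressMatcher(["贵州省", "遵义市", "红花岗区"], ["贵州", "遵义"])
    print(matcher.find("我家在贵州遵义红花岗区"))  # ['贵州', '遵义', '红花岗区']
```

Keeping the fixed set immutable (`frozenset`) while the alias set stays mutable mirrors the rule that only the short-name dictionary may be edited.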
Usage
from pyunit_address import Address


def test():
    address = Address(is_max_address=True)
    af = address.find_address('我家在贵州遵义红花岗区,你家在贵州贵阳花溪区')
    print(af)


if __name__ == '__main__':
    test()
Appending to the dictionary
from pyunit_address import Address


def test_add():
    address = Address(is_max_address=True)
    address.add_vague_text('红花岗')  # append an entry to the default dictionary
    address.add_vague_text('花溪')
    # address.add_vague_text(['红花岗', '花溪'])  # or append a list of entries at once
    af = address.find_address('我家在贵州遵义红花岗区,你家在贵州贵阳花溪')
    print(af)


if __name__ == '__main__':
    test_add()
Loading a custom dictionary
from pyunit_address import Address


def test_load():
    address = Address(is_max_address=True)
    address.set_vague_text(['红花岗', '花溪'])  # load a list of entries, replacing the default dictionary
    # address.set_vague_text('自定义词库.txt')  # load a dictionary file, replacing the default dictionary
    af = address.find_address('我家在贵州遵义红花岗区,你家在贵州贵阳花溪')
    print(af)


if __name__ == '__main__':
    test_load()
Automatic address completion
from pyunit_address import Address


def test_supplement_address():
    address = Address(is_max_address=True)
    asu = address.supplement_address('我家在遵义')  # 贵州省-遵义市
    print(asu)


if __name__ == '__main__':
    test_supplement_address()
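To make the idea of completion concrete, here is a minimal sketch of how a short name could be expanded into its full chain of administrative levels. It is an assumption-laden illustration, not how pyunit-address works internally: the tables `PARENT_OF` and `ALIASES` and the function `complete` are hypothetical, and unlike `supplement_address` the sketch takes a single place name rather than a whole sentence.

```python
from typing import Dict, Optional

# Hypothetical child -> parent table; a real dictionary would hold all five levels.
PARENT_OF: Dict[str, Optional[str]] = {
    "贵州省": None,
    "遵义市": "贵州省",
    "红花岗区": "遵义市",
}

# Hypothetical short name -> full name table.
ALIASES: Dict[str, str] = {
    "贵州": "贵州省",
    "遵义": "遵义市",
    "红花岗": "红花岗区",
}


def complete(name: str) -> str:
    """Return the full chain from the top level down to `name`, joined by '-'."""
    full_name = ALIASES.get(name, name)
    if full_name not in PARENT_OF:
        raise KeyError(f"unknown place name: {name}")
    chain = []
    current: Optional[str] = full_name
    # Walk upward through the parents until the top level is reached.
    while current is not None:
        chain.append(current)
        current = PARENT_OF[current]
    return "-".join(reversed(chain))


if __name__ == "__main__":
    print(complete("遵义"))  # 贵州省-遵义市
```

Storing only the immediate parent of each entry keeps the table small; the full chain is rebuilt on demand by walking upward.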
TODO
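- Automatically find the longest address
- New nationwide five-level address dictionary
- Support for custom address dictionaries
- Irregular addresses are not supported
- Support for automatic address completion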