Nationwide five-level address lookup

PyUnit-Address

String-based address lookup with support for custom address lexicons

Installation

pip install pyunit-address

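Notes

The algorithm uses two lexicons. One is the nationwide five-level address lexicon (province, city, county, town, village), compiled in 2019; it is loaded by default and can be neither deleted nor replaced. The other is the custom lexicon that you add yourself.
To extract irregular addresses, use the deep-learning model instead: https://github.com/PyUnit/pyunit-ner
Using the two together is recommended, since each covers the other's gaps.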
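
To picture how a lexicon-based lookup can pull place names out of a sentence, here is a minimal sketch using forward maximum matching over a toy lexicon. It is an illustration only and not necessarily how PyUnit-Address works internally; `TOY_LEXICON` and `find_names` are hypothetical names introduced for this example.

```python
# Toy lexicon (hypothetical); the real library ships a nationwide five-level lexicon.
TOY_LEXICON = {'贵州省', '贵州', '遵义市', '遵义', '红花岗区', '红花岗', '花溪区', '花溪'}


def find_names(text, lexicon=TOY_LEXICON):
    """Return lexicon entries found in text, preferring the longest match at each position."""
    max_len = max(map(len, lexicon))
    found, i = [], 0
    while i < len(text):
        # Try the longest candidate first, then shrink it one character at a time.
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            if piece in lexicon:
                found.append(piece)
                i += size
                break
        else:
            i += 1  # no entry starts here; move on by one character
    return found


print(find_names('我家在贵州遵义红花岗区'))
```

Because the longest candidate is tried first, 红花岗区 is preferred over the shorter entry 红花岗 when both are in the lexicon.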

Test

from pyunit_address import Address, find_address, supplement_address, correct_address
import time

address = Address(is_max_address=True)
address.add_vague_text(['红花岗', '花溪'])  # add individual place names
address.add_vague_text('贵州省-遵义市-遵义县-虾子镇-乐安村-乐石台')  # add an ordered address chain
address.add_vague_text('自定义词库.txt')  # load a lexicon file; each line is either an ordered address chain or a single place name


def all_test():
    string_ = '我家在红花岗,你家在贵州贵阳花溪区,他家在贵州省遵义市花溪区'
    finds = find_address(address, string_)
    for find in finds:
        print()
        print('Address', find)
        print('Completed address', supplement_address(address, find))
        print('Corrected address', correct_address(address, find))
        print('--------------------------')


# Address 红花岗
# Completed address ['贵州省-遵义市-红花岗区']
# Corrected address 贵州省-遵义市-红花岗区
# --------------------------
# 
# Address 贵州贵阳花溪区
# Completed address ['贵州省-贵阳市-花溪区']
# Corrected address 贵州省-贵阳市-花溪区
# --------------------------
# 
# Address 贵州省遵义市花溪区                Note: this address is wrong
# Completed address []                     Note: a wrong address cannot be completed
# Corrected address 贵州省-贵阳市-花溪区      Note: the wrong address is corrected to the right one
# --------------------------


if __name__ == '__main__':
    start = time.time()
    all_test()
    print(time.time() - start)  # 0.0002001047134399414 seconds
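
The test above loads a lexicon file in which each line is either a single place name or an ordered address chain. As a rough sketch of that file format, the snippet below writes such a file and reads it back, assuming chain levels are joined by `-` as in the examples. `parse_lexicon_lines` and the temporary file are hypothetical and only illustrate the layout; they are not part of the library.

```python
import os
import tempfile


def parse_lexicon_lines(lines):
    """Split each non-empty line into its address levels (a single name yields one level)."""
    return [line.strip().split('-') for line in lines if line.strip()]


# Write a small lexicon file: one entry per line.
entries = ['红花岗', '贵州省-遵义市-遵义县-虾子镇-乐安村-乐石台']
with tempfile.NamedTemporaryFile('w', suffix='.txt', encoding='utf-8', delete=False) as f:
    f.write('\n'.join(entries) + '\n')
    path = f.name

# Read it back and split every line into levels.
with open(path, encoding='utf-8') as f:
    parsed = parse_lexicon_lines(f)
os.remove(path)

print(parsed)
```

A file written this way is the kind of path that `address.add_vague_text(...)` is given in the test above.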

Find addresses

from pyunit_address import Address, find_address


def test():
    address = Address(is_max_address=True)

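    # Add lexicon entries: a single string, a list of strings, or a lexicon file with one entry per line
    address.add_vague_text('红花岗')  # append a place name to the default lexicon
    address.add_vague_text('贵州省-遵义市-遵义县-虾子镇-乐安村')  # add an address chain used for completion
    address.add_vague_text(['花溪', '贵州省-遵义市-遵义县-虾子镇-乐安村'])  # load a list of entries (appended to the default lexicon)
    address.add_vague_text('自定义词库.txt')  # load a lexicon file (appended to the default lexicon)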
    af = find_address(address, '我家在贵州遵义红花岗区')
    print(af)


if __name__ == '__main__':
    test()

Automatic address completion: pass in a sentence

from pyunit_address import Address, supplement_address


def test_supplement_address():
    address = Address(is_max_address=True)
    asu = supplement_address(address, '我家在遵义县')  # [贵州省-遵义市-遵义县]
    print(asu)


if __name__ == '__main__':
    test_supplement_address()
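
Completion can be thought of as looking up a partial name in a set of known full chains. The following is a minimal sketch of that idea over toy data, not the library's implementation; `TOY_CHAINS` and `complete` are hypothetical names.

```python
# Toy set of known address chains (hypothetical data for illustration).
TOY_CHAINS = [
    '贵州省-遵义市-遵义县',
    '贵州省-遵义市-红花岗区',
    '贵州省-贵阳市-花溪区',
]


def complete(name, chains=TOY_CHAINS):
    """Return each known chain, cut off at the first level that starts with name."""
    results = []
    for chain in chains:
        levels = chain.split('-')
        for depth, level in enumerate(levels, start=1):
            if level.startswith(name):
                prefix = '-'.join(levels[:depth])
                if prefix not in results:  # avoid duplicates from sibling chains
                    results.append(prefix)
                break
    return results


print(complete('遵义县'))
print(complete('红花岗'))
```

Matching on the start of a level lets a shortened name such as 红花岗 resolve to the full level 红花岗区.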

Automatic address correction

from pyunit_address import Address, correct_address


def correct_address_test():
    address = Address(is_max_address=True)
    print(correct_address(address, '贵州省遵义市花溪区'))  # 贵州省-贵阳市-花溪区


if __name__ == '__main__':
    correct_address_test()
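
One plausible way to correct an inconsistent address is to trust its most specific level and rebuild the chain from known data. The sketch below does this over toy chains, assuming the input has already been split into levels. It is a guess at the general idea, not the library's algorithm; `TOY_CHAINS` and `correct` are hypothetical names.

```python
# Toy set of known address chains (hypothetical data for illustration).
TOY_CHAINS = [
    '贵州省-遵义市-红花岗区',
    '贵州省-贵阳市-花溪区',
]


def correct(levels, chains=TOY_CHAINS):
    """Return the known chain that ends in the input's last level and shares
    the most levels with the input, or None if no chain ends in that level."""
    candidates = [chain.split('-') for chain in chains]
    candidates = [c for c in candidates if c[-1] == levels[-1]]
    if not candidates:
        return None
    best = max(candidates, key=lambda c: len(set(c) & set(levels)))
    return '-'.join(best)


print(correct(['贵州省', '遵义市', '花溪区']))
```

Here the city 遵义市 conflicts with the district 花溪区, so the chain is rebuilt around the district.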

Docker deployment

docker pull jtyoui/pyunit-address
docker run -d -P jtyoui/pyunit-address

Swagger online API documentation

http://localhost:xxx/docs

Request parameters for finding addresses

Parameter  Type    Nullable  Description
data       string  YES       A sentence containing an address

Request example

Python 3 Requests test

import requests

url = "http://127.0.0.1:2312/pyunit/address/find"
data = {
    'data': '我家在贵州龙里'
}
response = requests.get(url, params=data).json()
print(response)

Response

{
  "code": 200,
  "result": [
    {
      "address": "龙里",
      "correct_address": "贵州省-黔南布依族苗族自治州-龙里县",
      "supplement_address": [
        {
          "key": "贵州省-黔南布依族苗族自治州-龙里县"
        }
      ],
      "type": "区县"
    }
  ]
}
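
To show how a client might consume this response, here is a small sketch that parses the sample JSON above and pulls out the fields of each result. `summarize` is a hypothetical helper, and treating any `code` other than 200 as a failure is an assumption, since the error format is not documented here.

```python
import json

# The sample response shown above, embedded so the sketch runs offline.
SAMPLE = '''
{
  "code": 200,
  "result": [
    {
      "address": "龙里",
      "correct_address": "贵州省-黔南布依族苗族自治州-龙里县",
      "supplement_address": [{"key": "贵州省-黔南布依族苗族自治州-龙里县"}],
      "type": "区县"
    }
  ]
}
'''


def summarize(response):
    """Return (address, corrected address, completions) for each result item."""
    if response.get('code') != 200:  # assumption: non-200 means the call failed
        raise ValueError(f"unexpected code: {response.get('code')}")
    return [
        (item['address'], item['correct_address'],
         [s['key'] for s in item['supplement_address']])
        for item in response['result']
    ]


rows = summarize(json.loads(SAMPLE))
print(rows)
```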

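Request parameters for adding to the address lexicon

Parameter  Type    Nullable  Description
data       string  YES       A JSON-encoded list of address entries to add

Request example

Python 3 Requests test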

import json
import requests

url = "http://127.0.0.1:2312/pyunit/address/add"
data = {
    'data': json.dumps(['贵州省-贵阳市-观山湖区-观山湖公园', '金融大街', '小吃城'])
}
response = requests.get(url, params=data).json()
print(response)
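
The `data` parameter in the request above carries a list serialized as JSON inside the query string. The sketch below uses only the standard library to show that encoding and how it can be decoded again. The decoding half is an assumption about what a server would do, not a description of this service's code.

```python
import json
from urllib.parse import parse_qs, urlencode

words = ['贵州省-贵阳市-观山湖区-观山湖公园', '金融大街', '小吃城']

# Client side: serialize the list to JSON, then URL-encode it as the `data` parameter.
query = urlencode({'data': json.dumps(words)})

# Server side (assumed): decode the query string, then parse the JSON payload.
decoded = json.loads(parse_qs(query)['data'][0])

print(query[:40] + '...')
print(decoded == words)
```

Since `json.dumps` escapes non-ASCII characters by default, the query string stays pure ASCII and the Chinese entries survive the round trip unchanged.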

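Request parameters for deleting from the address lexicon

Parameter  Type    Nullable  Description
data       string  YES       A JSON-encoded list of address entries to delete

Request example

Python 3 Requests test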

import json

import requests

url = "http://127.0.0.1:2312/pyunit/address/del"
data = {
    'data': json.dumps(['金融大街', '小吃城']),
}
response = requests.get(url, params=data).json()
print(response)

Response

{
  "code": 200,
  "result": "del success"
}

TODO

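  • Automatically find the longest address match
  • New nationwide five-level address lexicon
  • Support for custom address lexicons
  • Irregular addresses are not supported
  • Support for automatic address completion
  • Support for fast, efficient search
  • Support for address correction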

Download files

Source Distribution: pyunit_address-2021.3.31.tar.gz (3.0 MB)
Built Distribution: pyunit_address-2021.3.31-py3-none-any.whl (3.0 MB, Python 3)
