
A Japanese-address geocoder for Python.


Jageocoder - A Python Japanese geocoder

For the Japanese version, please read README_ja.md.

This is a Python port of DAMS, the Japanese-address geocoder used in the "Address Matching Service" of CSIS at the University of Tokyo and in GSI Maps.

Getting Started

This package provides address-geocoding and reverse-geocoding functionality for Python programs. The basic usage is to specify a dictionary with init() then call search() to get geocoding results.

>>> import jageocoder
>>> jageocoder.init(url='https://jageocoder.info-proto.com/jsonrpc')
>>> jageocoder.search('新宿区西新宿2-8-1')
{'matched': '新宿区西新宿2-8-', 'candidates': [{'id': 5961406, 'name': '8番', 'x': 139.691778, 'y': 35.689627, 'level': 7, 'note': None, 'fullname': ['東京都', '新宿区', '西新宿', '二丁目', '8番']}]}

How to install

Prerequisites

Requires Python 3.9.2 or later.

All other required packages will be installed automatically.

Install instructions

  • Install the package with pip install jageocoder

To use Jageocoder, you need to either install the "Dictionary Database" on the same machine or connect to the RPC service provided by jageocoder-server.

Install Dictionary Database

When a dictionary database is installed, large amounts of data can be processed at high speed. A database covering addresses in Japan requires 20 GB or more of storage.

  • Download an address database file compatible with the installed version of jageocoder, for example:

    jageocoder download-dictionary https://www.info-proto.com/static/jageocoder/20250423/v2/jukyo_all_20250423_v22.zip
    
  • Install the dictionary with the install-dictionary command

    jageocoder install-dictionary jukyo_all_20250423_v22.zip
    

If you need to know the location of the dictionary directory, run the get-db-dir command as follows (or call jageocoder.get_db_dir() in your script).

jageocoder get-db-dir

If you prefer to create the database in another location, set the environment variable JAGEOCODER_DB2_DIR to the target directory before running install-dictionary.

export JAGEOCODER_DB2_DIR='/usr/local/share/jageocoder/db2'
jageocoder install-dictionary <db-file>

Connect to the Jageocoder server

Dictionary databases are large, so installing them on multiple machines consumes storage and makes updates time-consuming. Instead of installing a dictionary database on each machine, you can connect to a Jageocoder server and let it perform the search.

To use a server, specify the server endpoint in the environment variable JAGEOCODER_SERVER_URL. For the public demonstration server, use the following:

export JAGEOCODER_SERVER_URL=https://jageocoder.info-proto.com/jsonrpc

However, the public demonstration server cannot handle heavy traffic, so the number of requests per second is limited. If you want to process a large number of requests, set up your own Jageocoder server (see jageocoder-server). The endpoint is '/jsonrpc' on the server.
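Because the public demonstration server limits the request rate, a batch job may want to space out its calls. The following is a minimal sketch of one way to do that. The names throttled and min_interval are hypothetical helpers, not part of jageocoder, and the interval is a placeholder because the actual limit is not stated here.

```python
import time
from typing import Callable, Iterable, Iterator, Tuple


def throttled(func: Callable, items: Iterable, min_interval: float = 1.0,
              clock: Callable[[], float] = time.monotonic,
              sleep: Callable[[float], None] = time.sleep
              ) -> Iterator[Tuple[object, object]]:
    """Call func(item) for each item, waiting at least min_interval
    seconds between the starts of consecutive calls.

    Hypothetical helper; clock and sleep are injectable for testing.
    """
    last_start = None
    for item in items:
        if last_start is not None:
            wait = min_interval - (clock() - last_start)
            if wait > 0:
                sleep(wait)
        last_start = clock()
        yield item, func(item)
```

After jageocoder.init(), this could be used as `for address, result in throttled(jageocoder.search, addresses): ...` so that at most one request is started per interval.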

Uninstall instructions

Remove the directory containing the database, or run the uninstall-dictionary command as follows.

jageocoder uninstall-dictionary

Then, uninstall the package with pip.

pip uninstall jageocoder

How to use

Use from the command line

Jageocoder is intended to be embedded in applications as a library and used by calling the API, but a simple command line interface is also provided.

For example, to geocode an address, execute the following command.

jageocoder search 新宿区西新宿2-8-1

You can check the list of available commands with --help.

jageocoder --help

Using API

First, import jageocoder and initialize it with init().

>>> import jageocoder
>>> jageocoder.init()

The db_dir parameter of init() specifies the directory where the address database is installed. Alternatively, you can specify the endpoint URL of the Jageocoder server with url. If both are omitted, the values of the environment variables JAGEOCODER_DB2_DIR and JAGEOCODER_SERVER_URL are used.
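If the settings come from an application's own configuration rather than from environment variables, the arguments for init() can be assembled explicitly. This is a minimal sketch: init_kwargs is a hypothetical helper, and preferring a local database over a server when both are given is this sketch's own choice, not documented jageocoder behaviour.

```python
from typing import Dict, Optional


def init_kwargs(db_dir: Optional[str] = None,
                url: Optional[str] = None) -> Dict[str, str]:
    """Build keyword arguments for jageocoder.init().

    Hypothetical helper. Assumption: when both values are given,
    the local database is preferred over the server.
    """
    if db_dir:
        return {"db_dir": db_dir}
    if url:
        return {"url": url}
    # Empty dict: init() then relies on the environment variables.
    return {}
```

It would be called as `jageocoder.init(**init_kwargs(url='https://jageocoder.info-proto.com/jsonrpc'))`.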

Search for latitude and longitude by address

Use search() to look up the longitude and latitude of an address.

The search() function returns a dict with matched as the matched string and candidates as the list of search results. (The output below is reformatted for readability.)

Each element of candidates contains the information of an address node (AddressNode).

>>> jageocoder.search('新宿区西新宿2-8-1')
{
  'matched': '新宿区西新宿2-8-',
  'candidates': [{
    'id': 12299846, 'name': '8番',
    'x': 139.691778, 'y': 35.689627, 'level': 7, 'note': None,
    'fullname': ['東京都', '新宿区', '西新宿', '二丁目', '8番']
  }]
}

The meaning of the items is as follows:

  • id: ID in the database
  • name: Address notation
  • x: longitude
  • y: latitude
  • level: Address level (1:Prefecture, 2:County, 3:City and 23 district, 4:Ward, 5:Oaza, 6:Aza and Chome, 7:Block, 8:Building)
  • note: Notes such as city codes
  • fullname: List of address notations from the prefecture level to this node
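The items listed above can be turned into a compact summary. The sketch below assumes only the dict layout shown in the search() output. LEVEL_NAMES and summarize are names introduced here for illustration, not part of jageocoder.

```python
# Level numbers and labels as listed above.
LEVEL_NAMES = {
    1: "Prefecture", 2: "County", 3: "City and 23 district", 4: "Ward",
    5: "Oaza", 6: "Aza and Chome", 7: "Block", 8: "Building",
}


def summarize(result: dict) -> list:
    """Return (address, latitude, longitude, level name) per candidate."""
    rows = []
    for cand in result["candidates"]:
        address = "".join(cand["fullname"])
        # x is longitude and y is latitude, so emit (lat, lon) order.
        rows.append((address, cand["y"], cand["x"],
                     LEVEL_NAMES.get(cand["level"], "Unknown")))
    return rows


# Sample copied from the search() output above.
sample = {
    "matched": "新宿区西新宿2-8-",
    "candidates": [{
        "id": 12299846, "name": "8番",
        "x": 139.691778, "y": 35.689627, "level": 7, "note": None,
        "fullname": ["東京都", "新宿区", "西新宿", "二丁目", "8番"],
    }],
}
print(summarize(sample))
```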

Search for addresses by longitude and latitude

You can specify the latitude and longitude of a point and look up the address of that point (so-called reverse geocoding).

When you pass the longitude and latitude of the point you wish to look up to reverse(), you can retrieve up to three address nodes surrounding the specified point.

>>> import jageocoder
>>> jageocoder.init()
>>> triangle = jageocoder.reverse(139.6917, 35.6896, level=7)
>>> if len(triangle) > 0:
...     print(triangle[0]['candidate']['fullname'])
...
['東京都', '新宿区', '西新宿', '二丁目', '8番']

In the example above, the level optional parameter is set to 7 to search down to the block (街区・地番) level.
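Since reverse() may return fewer than three entries, or none, code that consumes the result should not assume a fixed length. A minimal sketch, relying only on the 'candidate' and 'fullname' keys used in the example above. The name reverse_addresses is hypothetical.

```python
def reverse_addresses(triangle: list) -> list:
    """Join the fullname of each reverse() entry into one string."""
    return ["".join(entry["candidate"]["fullname"]) for entry in triangle]


# Hand-written sample with the shape used in the example above;
# real results may contain up to three entries and more keys.
sample = [
    {"candidate": {"fullname": ["東京都", "新宿区", "西新宿", "二丁目", "8番"]}},
]
print(reverse_addresses(sample))
```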

Note: Indexes for reverse geocoding are automatically created the first time you perform reverse geocoding. This process can take a long time.

Explore the attribute information of an address

Use searchNode() to retrieve information about an address.

This function returns a list of jageocoder.result.Result objects. You can access the address node through the node attribute of a Result object.

>>> results = jageocoder.searchNode('新宿区西新宿2-8-1')
>>> len(results)
1
>>> results[0].matched
'新宿区西新宿2-8-'
>>> type(results[0].node)
<class 'jageocoder.node.AddressNode'>
>>> node = results[0].node
>>> node.get_fullname()
['東京都', '新宿区', '西新宿', '二丁目', '8番']
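In the example above, the matched string is shorter than the query: the trailing '1' was not resolved. The following sketch extracts that remainder, for instance to flag addresses that were only partially geocoded. The name unmatched_part is hypothetical, and the sketch assumes that matched is a leading substring of the query, as it is in this example.

```python
def unmatched_part(query: str, matched: str) -> str:
    """Return the tail of query that was not resolved.

    Assumption: matched is a leading substring of query. If it is
    not, the whole query is returned unchanged.
    """
    if query.startswith(matched):
        return query[len(matched):]
    return query


print(unmatched_part("新宿区西新宿2-8-1", "新宿区西新宿2-8-"))  # prints 1
```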

Get GeoJSON representation

You can use the as_geojson() method of the Result and AddressNode objects to obtain the GeoJSON representation.

>>> results[0].as_geojson()
{'type': 'Feature', 'geometry': {'type': 'Point', 'coordinates': [139.6917724609375, 35.68962860107422]}, 'properties': {'id': 80223284, 'name': '8番', 'level': 7, 'priority': 3, 'note': '', 'parent_id': 80223179, 'sibling_id': 80223285, 'fullname': ['東京都', '新宿区', '西新宿', '二丁目', '8番'], 'matched': '新宿区西新宿2-8-'}}
>>> results[0].node.as_geojson()
{'type': 'Feature', 'geometry': {'type': 'Point', 'coordinates': [139.6917724609375, 35.68962860107422]}, 'properties': {'id': 80223284, 'name': '8番', 'level': 7, 'priority': 3, 'note': '', 'parent_id': 80223179, 'sibling_id': 80223285, 'fullname': ['東京都', '新宿区', '西新宿', '二丁目', '8番']}}
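Several Feature dicts can be wrapped into a GeoJSON FeatureCollection for use in GIS tools. This is a minimal sketch using only the standard library. The name to_feature_collection is hypothetical, and the sample feature is a trimmed copy of the output above.

```python
import json


def to_feature_collection(features) -> dict:
    """Wrap GeoJSON Feature dicts in a FeatureCollection."""
    return {"type": "FeatureCollection", "features": list(features)}


# Trimmed copy of the as_geojson() output shown above.
feature = {
    "type": "Feature",
    "geometry": {"type": "Point",
                 "coordinates": [139.6917724609375, 35.68962860107422]},
    "properties": {"name": "8番", "level": 7,
                   "fullname": ["東京都", "新宿区", "西新宿", "二丁目", "8番"]},
}

# ensure_ascii=False keeps the Japanese text readable in the output.
text = json.dumps(to_feature_collection([feature]), ensure_ascii=False)
print(text)
```

With real results, the list could be built as `[r.as_geojson() for r in results]`.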

Get the local government codes

There are two types of local government codes: JISX0402 (5-digit) and Local Government Code (6-digit).

You can also obtain the prefecture code JISX0401 (2 digits).

>>> node.get_city_jiscode()  # 5-digit code
'13104'
>>> node.get_city_local_authority_code() # 6-digit code
'131041'
>>> node.get_pref_jiscode()  # prefecture code
'13'
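The 6-digit code is the 5-digit code followed by a check digit. The sketch below computes that digit with the commonly described rule for these codes (weights 6, 5, 4, 3, 2 and modulus 11). The rule is background knowledge rather than something taken from jageocoder, so treat get_city_local_authority_code() as the authoritative source. The function name is hypothetical.

```python
def append_check_digit(jiscode: str) -> str:
    """Append the check digit to a 5-digit JIS X 0402 code.

    Assumed rule: multiply the digits by 6, 5, 4, 3, 2, sum the
    products, and take the last digit of 11 minus (sum mod 11).
    """
    if len(jiscode) != 5 or not (jiscode.isascii() and jiscode.isdigit()):
        raise ValueError("expected a 5-digit code")
    total = sum(int(d) * w for d, w in zip(jiscode, (6, 5, 4, 3, 2)))
    check = (11 - total % 11) % 10
    return jiscode + str(check)


print(append_check_digit("13104"))  # prints 131041
```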

Get link URLs to maps

Generate URLs to link to GSI and Google maps.

>>> node.get_gsimap_link()
'https://maps.gsi.go.jp/#16/35.689627/139.691778/'
>>> node.get_googlemap_link()
'https://maps.google.com/maps?q=35.689627,139.691778&z=16'
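When only coordinates are available, for example from the dict returned by search(), equivalent links can be built by hand. The sketch below mirrors the URL patterns shown above. The function names are hypothetical, and the number formatting is Python's default, which may differ from what the library emits for other coordinates.

```python
def gsimap_link(x: float, y: float, zoom: int = 16) -> str:
    """Build a GSI Maps URL; x is longitude, y is latitude."""
    return f"https://maps.gsi.go.jp/#{zoom}/{y}/{x}/"


def googlemap_link(x: float, y: float, zoom: int = 16) -> str:
    """Build a Google Maps URL; x is longitude, y is latitude."""
    return f"https://maps.google.com/maps?q={y},{x}&z={zoom}"


print(gsimap_link(139.691778, 35.689627))
print(googlemap_link(139.691778, 35.689627))
```

Both URL formats put latitude before longitude, the reverse of the x, y order.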

Traverse the parent node

A "parent node" is the node one level above the address. Access it through the parent attribute.

Now the node points to '8番', so the parent node will be '二丁目'.

>>> parent = node.parent
>>> parent.get_fullname()
['東京都', '新宿区', '西新宿', '二丁目']
>>> parent.x, parent.y
(139.691774, 35.68945)
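Following parent repeatedly walks up the address hierarchy. The sketch below uses a stand-in class so that it runs without a dictionary. FakeNode and ancestor_names are hypothetical, and the sketch assumes that the topmost node's parent is None, which may not match how jageocoder represents the root.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class FakeNode:
    """Stand-in for AddressNode with only the attributes used here."""
    name: str
    parent: Optional["FakeNode"] = None


def ancestor_names(node) -> List[str]:
    """Collect names from the top level down to the given node.

    Assumption: the chain ends when parent is None.
    """
    names = []
    current = node
    while current is not None:
        names.append(current.name)
        current = current.parent
    return list(reversed(names))


chome = FakeNode("二丁目", FakeNode("西新宿", FakeNode("新宿区", FakeNode("東京都"))))
block = FakeNode("8番", chome)
print(ancestor_names(block))
```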

Traverse the child nodes

A "child node" is a node one level below the address. Access the child nodes through the children attribute.

There is always only one parent node, but there can be multiple child nodes. Therefore, children returns a list of address nodes.

Now parent points to '二丁目', so its child nodes are the block numbers (○番, △番地) it contains.

>>> type(parent.children)
<class 'list'>
>>> len(parent.children)
50
>>> [child.name for child in parent.children]
['1番', '1番地', '10番', '10番地', '11番', '11番地', '12番地', '134番地', '135番地', '136番地', '139番地', '140番地', '141番地', '145番地', '158番地', '174番地', '178番地', '181番地', '2番', '2番地', '3番', '3番地', '308番地', '309番地', '310番地', '311番地', '313番地', '314番地', '315番地', '318番地', '4番', '4番地', '5番', '5番地', '6番', '6番地', '673番地', '674番地', '7番', '7番地', '705番地', '708番地', '710番地', '733番地', '734番地', '735番地', '8番', '8番地', '9番', '9番地']
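The names above are in string order, so '10番' comes before '2番'. If numeric order is preferred for display, a natural-sort key can be applied. This is a minimal sketch: natural_key is a hypothetical helper, and it assumes that names start with a number, as they do in this list.

```python
import re


def natural_key(name: str):
    """Sort key: leading number first, then the remaining suffix."""
    m = re.match(r"(\d+)(.*)", name)
    if m is None:
        # Names without a leading number sort last.
        return (float("inf"), name)
    return (int(m.group(1)), m.group(2))


# A few names taken from the list above.
names = ["1番", "1番地", "10番", "10番地", "2番", "2番地", "8番"]
print(sorted(names, key=natural_key))
```

With real nodes, the same key could be used as `sorted(parent.children, key=lambda c: natural_key(c.name))`.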

For developers

Documentation

Tutorials and references are available in the online documentation.

Create your own dictionary

Consider using jageocoder-converter.

Tests

Run the tests as follows:

  • pytest for unit tests
  • pytest jageocoder/ --doctest-modules for the sample code in comments
  • pytest docs/source/ --doctest-glob=*.rst for the code in the online manual

Contributing

Address notation varies, so suggestions for improving the parsing logic are welcome. Please submit an issue with examples of the address notations in use and how they should be parsed.

Authors

License

This project is licensed under the MIT License.

The MIT License does not cover the dictionary data. Please follow the license of the respective dictionary data.

Acknowledgements

We would like to thank CSIS for allowing us to provide address matching services on their institutional website for over 20 years.

We would also like to thank Professor Asanobu Kitamoto of NII for providing us with a large sample of areas using the older address system and for his extensive help in confirming the results of our analysis.
