A Python implementation of a Japanese-address geocoder.
jageocoder - A Python Japanese geocoder
For the Japanese version, please read README_ja.md.
This is a Python port of the Japanese-address geocoder used in the "Address Matching Service" of CSIS at the University of Tokyo and in GSI Maps.
Getting Started
This package provides address-geocoding functionality for Python programs. The basic usage is to specify a dictionary with init(), then call search() to get geocoding results.
python
>>> import jageocoder
>>> jageocoder.init()
>>> jageocoder.search('新宿区西新宿2-8-1')
{'matched': '新宿区西新宿2-8-', 'candidates': [{'id': 5961406, 'name': '8番', 'x': 139.691778, 'y': 35.689627, 'level': 7, 'note': None, 'fullname': ['東京都', '新宿区', '西新宿', '二丁目', '8番']}]}
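As a minimal sketch of consuming such a result, the snippet below reads the first candidate from the sample output above. It assumes that 'x' is the longitude and 'y' the latitude (inferred from the sample values); the helper name first_candidate is hypothetical and not part of jageocoder.

```python
# Result copied from the sample session above, so this runs without a dictionary.
result = {
    'matched': '新宿区西新宿2-8-',
    'candidates': [{
        'id': 5961406, 'name': '8番',
        'x': 139.691778, 'y': 35.689627,
        'level': 7, 'note': None,
        'fullname': ['東京都', '新宿区', '西新宿', '二丁目', '8番'],
    }],
}


def first_candidate(result):
    """Return (full address, x, y) for the first candidate, or None if there is none.

    Hypothetical helper: 'x' is assumed to be longitude and 'y' latitude.
    """
    candidates = result.get('candidates') or []
    if not candidates:
        return None
    top = candidates[0]
    # 'fullname' is a list of address elements; join them into one string.
    return ''.join(top['fullname']), top['x'], top['y']


print(first_candidate(result))
```

Checking for an empty candidate list first keeps the caller safe when an address cannot be matched at all.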
How to install
Prerequisites
Requires Python 3.6.x or later.
The following packages will be installed automatically.
- marisa-trie for building and retrieving TRIE index
- SQLAlchemy for abstracting access to the RDBMS
Install instructions
- Install the package with pip install jageocoder
- Install the dictionary with the install-dictionary command
pip install jageocoder
python -m jageocoder install-dictionary
By default, the dictionary database will be created under {sys.prefix}/jageocoder/db/, or, if the user doesn't have write permission there, under {site.USER_DATA}/jageocoder/db/.
If you need to know the location of the directory containing the dictionary database, run the get-db-dir command as follows, or call jageocoder.get_db_dir() in your script.
python -m jageocoder get-db-dir
If you prefer to create it in another location, set the environment variable JAGEOCODER_DB_DIR to specify the directory before executing install_dictionary().
export JAGEOCODER_DB_DIR='/usr/local/share/jageocoder/db'
python -m jageocoder install-dictionary
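The lookup order described above (environment variable first, then the directory under sys.prefix, then a per-user directory) can be sketched as follows. This is an illustration of the documented behaviour, not jageocoder's actual code; the function resolve_db_dir and its parameters are hypothetical, and the per-user base directory is passed in by the caller rather than guessed. In real scripts, jageocoder.get_db_dir() is the call to use.

```python
import os
import sys
from pathlib import Path


def resolve_db_dir(user_data_dir, environ=None, prefix=None, can_write=os.access):
    """Hypothetical sketch of the documented directory lookup order."""
    environ = os.environ if environ is None else environ

    # 1. An explicit JAGEOCODER_DB_DIR always wins.
    override = environ.get("JAGEOCODER_DB_DIR")
    if override:
        return Path(override)

    # 2. Otherwise use {sys.prefix}/jageocoder/db/ when it is writable.
    prefix = Path(sys.prefix if prefix is None else prefix)
    if can_write(prefix, os.W_OK):
        return prefix / "jageocoder" / "db"

    # 3. Fall back to the per-user data directory supplied by the caller.
    return Path(user_data_dir) / "jageocoder" / "db"


print(resolve_db_dir("/home/user/.local",
                     environ={"JAGEOCODER_DB_DIR": "/usr/local/share/jageocoder/db"}))
```

The environment, prefix, and permission check are parameters only so that the three cases can be exercised without touching the real file system.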
Update dictionary
The install-dictionary command downloads and installs a version of the address dictionary file that is compatible with the currently installed jageocoder package.
If you upgrade the jageocoder package after installing the address dictionary file, the package may no longer be compatible with the installed dictionary. In that case, you will need to reinstall or update the dictionary.
To update the dictionary, run the upgrade-dictionary command. This process may take a long time.
python -m jageocoder upgrade-dictionary
Uninstall instructions
Remove the directory containing the database, or run the uninstall-dictionary command as follows.
python -m jageocoder uninstall-dictionary
Then, uninstall the package with the pip command.
pip uninstall jageocoder
For developers
Running the unittests
python -m unittest
tests.test_search tests some special address notations:
- Street address in Sapporo city such as '北3西1' for '北三条西一丁目'
- Toorina (street-name notation) in Kyoto city such as '下立売通新町西入薮ノ内町' for '薮ノ内町'
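To make the Sapporo case concrete, here is a small sketch that expands the abbreviated form into the full notation. It is not how jageocoder parses addresses; the function names and the regular expression are illustrative assumptions, and only numbers from 1 to 99 are handled.

```python
import re

KANJI_DIGITS = "〇一二三四五六七八九"


def to_kanji(n):
    """Convert an integer in 1..99 to kanji numerals (e.g. 24 -> 二十四)."""
    if not 0 < n < 100:
        raise ValueError("only 1..99 is supported in this sketch")
    tens, ones = divmod(n, 10)
    out = ""
    if tens:
        # 10..19 are written with a bare 十, without a leading 一.
        out += ("" if tens == 1 else KANJI_DIGITS[tens]) + "十"
    if ones:
        out += KANJI_DIGITS[ones]
    return out


# direction + number + direction + number, e.g. 北3西1
_ABBREVIATED = re.compile(r"([東西南北])(\d+)([東西南北])(\d+)")


def expand_jo_chome(text):
    """Expand '北3西1' to '北三条西一丁目'; return other input unchanged."""
    m = _ABBREVIATED.fullmatch(text)
    if m is None:
        return text
    d1, n1, d2, n2 = m.groups()
    return f"{d1}{to_kanji(int(n1))}条{d2}{to_kanji(int(n2))}丁目"


print(expand_jo_chome("北3西1"))
```

A real geocoder would more likely match both notations against its index than rewrite the query string, so treat this only as an explanation of what the two forms mean.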
Create your own dictionary
Please use the dictionary converter jageocoder-converter.
Sample Web Application
A sample of a simple web app using Flask is available under flask-demo.
Perform the following steps, then access port 5000.
cd flask-demo
pip install flask
bash run.sh
ToDos
- Supporting address changes
  The functionality to handle address changes due to municipal consolidation, etc. has already been implemented in the C++ version, and will be implemented in this package in the future.
Contributing
Address notation varies, so suggestions for logic improvements are welcome. Please submit an issue with examples of address notations in use and how they should be parsed.
Authors
- Takeshi SAGARA - Info-proto Co.,Ltd.
License
This project is licensed under the MIT License.
This license does not cover the dictionary data. Please follow the license of the respective dictionary data.
Acknowledgements
We would like to thank CSIS for allowing us to provide address matching services on their institutional website for over 20 years.