flashgeotext :zap::earth_africa:
Extract and count countries and cities (and their synonyms) from text, like GeoText on steroids, using FlashText, an Aho-Corasick implementation. Flashgeotext is a fast, batteries-included (and BYOD) native Python library that extracts one or more sets of given city and country names (plus synonyms) from an input text.
Introductory blog post: https://iwpnd.github.io/articles/2020-02/flashgeotext-library
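To give a feel for what "names plus synonyms" matching involves, here is a minimal standard-library sketch. It is not FlashText and not how flashgeotext is implemented; it is a simplified word-level trie with longest-match scanning, and `build_trie`, `extract`, and the sample lookup data are hypothetical names chosen for illustration.

```python
import re
from collections import defaultdict

END = object()  # sentinel key marking the end of a complete name in the trie


def build_trie(lookup):
    """Build a word-level trie from {canonical_name: [synonyms]} (hypothetical helper)."""
    trie = {}
    for canonical, synonyms in lookup.items():
        for name in [canonical, *synonyms]:
            node = trie
            for word in name.split():
                node = node.setdefault(word, {})
            node[END] = canonical  # every synonym resolves to its canonical name
    return trie


def extract(text, trie):
    """Scan text once, preferring the longest name that starts at each word."""
    tokens = [(m.group(), m.start(), m.end()) for m in re.finditer(r"\w+", text)]
    found = defaultdict(lambda: {"count": 0, "span_info": [], "found_as": []})
    i = 0
    while i < len(tokens):
        node, best, j = trie, None, i
        # Walk the trie as long as the next word continues some known name.
        while j < len(tokens) and tokens[j][0] in node:
            node = node[tokens[j][0]]
            j += 1
            if END in node:
                best = (node[END], j)  # remember the longest complete match so far
        if best is None:
            i += 1
            continue
        canonical, end_index = best
        start, end = tokens[i][1], tokens[end_index - 1][2]
        entry = found[canonical]
        entry["count"] += 1
        entry["span_info"].append((start, end))
        entry["found_as"].append(text[start:end])
        i = end_index  # continue after the matched name
    return dict(found)


# Illustrative lookup data, not the library's bundled data set.
lookup = {"United States": ["US", "USA"], "New York": ["New York City", "NYC"]}
matches = extract("NYC is in the US. New York City is big.", build_trie(lookup))
print(matches)
```

The sketch matches case-sensitively and only on whole words, which keeps it short; a real keyword processor handles more edge cases.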
Usage
from flashgeotext.geotext import GeoText

geotext = GeoText()

# The continuation lines are indented by eight spaces each inside the string;
# the spans reported below count those characters.
input_text = '''Shanghai. The Chinese Ministry of Finance in Shanghai said that China plans
        to cut tariffs on $75 billion worth of goods that the country
        imports from the US. Washington welcomes the decision.'''

geotext.extract(input_text=input_text)
>> {
    'cities': {
        'Shanghai': {
            'count': 2,
            'span_info': [(0, 8), (45, 53)],
            'found_as': ['Shanghai', 'Shanghai'],
        },
        'Washington, D.C.': {
            'count': 1,
            'span_info': [(175, 185)],
            'found_as': ['Washington'],
        }
    },
    'countries': {
        'China': {
            'count': 1,
            'span_info': [(64, 69)],
            'found_as': ['China'],
        },
        'United States': {
            'count': 1,
            'span_info': [(171, 173)],
            'found_as': ['US'],
        }
    }
}
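The returned dictionary can be post-processed with plain Python. The sketch below works on a copy of the output shown above rather than calling the library, so it runs without flashgeotext installed. The helper names (`rank_places`, `spans_match`, `synonyms_used`) are hypothetical, and the sample text assumes the two continuation lines carry eight spaces of indentation each, which is what the reported offsets imply.

```python
# Sample text; the 8-space continuation indentation is an assumption
# inferred from the offsets in the documented output.
input_text = (
    "Shanghai. The Chinese Ministry of Finance in Shanghai said that China plans\n"
    "        to cut tariffs on $75 billion worth of goods that the country\n"
    "        imports from the US. Washington welcomes the decision."
)

# Copied from the example output above, not produced by running the library here.
result = {
    "cities": {
        "Shanghai": {
            "count": 2,
            "span_info": [(0, 8), (45, 53)],
            "found_as": ["Shanghai", "Shanghai"],
        },
        "Washington, D.C.": {
            "count": 1,
            "span_info": [(175, 185)],
            "found_as": ["Washington"],
        },
    },
    "countries": {
        "China": {"count": 1, "span_info": [(64, 69)], "found_as": ["China"]},
        "United States": {"count": 1, "span_info": [(171, 173)], "found_as": ["US"]},
    },
}


def rank_places(result):
    """Flatten the result into (name, category, count) rows, most frequent first."""
    rows = [
        (name, category, info["count"])
        for category, places in result.items()
        for name, info in places.items()
    ]
    return sorted(rows, key=lambda row: (-row[2], row[0]))


def spans_match(text, result):
    """Check that every span slices out exactly the string recorded in found_as."""
    return all(
        text[start:end] == surface
        for places in result.values()
        for info in places.values()
        for (start, end), surface in zip(info["span_info"], info["found_as"])
    )


def synonyms_used(result):
    """Map each canonical name to the surface forms that differ from it."""
    used = {}
    for places in result.values():
        for name, info in places.items():
            others = sorted({s for s in info["found_as"] if s != name})
            if others:
                used[name] = others
    return used


print(rank_places(result))
print(spans_match(input_text, result))
print(synonyms_used(result))
```

Each `span_info` pair is a half-open `(start, end)` character range, so `input_text[start:end]` yields the matching `found_as` entry.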
Getting Started
These instructions will get you a copy of the project up and running on your local machine for development and testing purposes.
Installing
pip:
pip install flashgeotext
conda:
conda install flashgeotext
for development:
git clone https://github.com/iwpnd/flashgeotext.git
cd flashgeotext/
poetry install
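After installing, one way to confirm that the package is visible to the current interpreter is to query its metadata with the standard library. This is a small sketch; `installed_version` is a hypothetical helper, and it reports `None` instead of raising when the distribution is absent.

```python
from importlib import metadata


def installed_version(package="flashgeotext"):
    """Return the installed version string of `package`, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


print(installed_version())
```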
Running the tests
poetry run pytest . -v
Authors
- Benjamin Ramser - Initial work - iwpnd
See also the list of contributors who participated in this project.
License
This project is licensed under the MIT License; see the LICENSE.md file for details.
Demo data: cities from http://www.geonames.org, licensed under the Creative Commons Attribution 3.0 License.
Acknowledgments