flashgeotext ⚡🌍
Extract and count countries and cities (and their synonyms) from text, like GeoText on steroids, using FlashText, an Aho-Corasick implementation. flashgeotext is a fast, batteries-included (and BYOD) native Python library that extracts one or more sets of given city and country names (and their synonyms) from an input text.
Introductory blog post: https://iwpnd.github.io/articles/2020-02/flashgeotext-library
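To make the idea of "names plus synonyms" concrete, here is a minimal sketch of synonym-aware counting. It is not how flashgeotext or FlashText is implemented: it swaps in a plain regular-expression alternation for the keyword lookup, and the names `LOOKUP` and `extract_places` as well as the sample data are assumptions made up for this illustration.

```python
import re

# Hypothetical lookup data: canonical name -> list of synonyms.
# This is illustration data, not the data shipped with flashgeotext.
LOOKUP = {
    "United States": ["US", "USA", "United States"],
    "China": ["China"],
}


def extract_places(text, lookup):
    """Count synonym matches and report them under their canonical name."""
    # Reverse index: every synonym points back to its canonical name.
    synonym_to_name = {
        synonym: name for name, synonyms in lookup.items() for synonym in synonyms
    }
    # Try longer synonyms first so "United States" is preferred over "US".
    alternatives = sorted(synonym_to_name, key=len, reverse=True)
    pattern = re.compile(
        r"\b(?:" + "|".join(re.escape(s) for s in alternatives) + r")\b"
    )

    result = {}
    for match in pattern.finditer(text):
        name = synonym_to_name[match.group()]
        entry = result.setdefault(
            name, {"count": 0, "span_info": [], "found_as": []}
        )
        entry["count"] += 1
        entry["span_info"].append(match.span())  # (start, end), end exclusive
        entry["found_as"].append(match.group())
    return result


places = extract_places("China imports from the US.", LOOKUP)
print(places)
```

A regular-expression alternation is fine for a handful of names. With very large keyword sets, a dedicated keyword-matching structure such as the one FlashText provides is the more usual choice.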
Usage
from flashgeotext.geotext import GeoText
geotext = GeoText()
input_text = '''Shanghai. The Chinese Ministry of Finance in Shanghai said that China plans
to cut tariffs on $75 billion worth of goods that the country
imports from the US. Washington welcomes the decision.'''
geotext.extract(input_text=input_text)
>> {
    'cities': {
        'Shanghai': {
            'count': 2,
            'span_info': [(0, 8), (45, 53)],
            'found_as': ['Shanghai', 'Shanghai'],
        },
        'Washington, D.C.': {
            'count': 1,
            'span_info': [(175, 185)],
            'found_as': ['Washington'],
        }
    },
    'countries': {
        'China': {
            'count': 1,
            'span_info': [(64, 69)],
            'found_as': ['China'],
        },
        'United States': {
            'count': 1,
            'span_info': [(171, 173)],
            'found_as': ['US'],
        }
    }
}
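The result is a plain nested dictionary, so it can be post-processed with ordinary Python. The sketch below shows how the `span_info` tuples can be read as half-open `(start, end)` offsets into the input text, and how the result can be reduced to simple counts. To stay runnable without the library installed, it uses a hard-coded, trimmed copy of the result above instead of a live `extract` call. The helpers `matched_substrings` and `counts_by_category` are hypothetical names, not part of flashgeotext.

```python
# Input text from the usage example above.
input_text = '''Shanghai. The Chinese Ministry of Finance in Shanghai said that China plans
to cut tariffs on $75 billion worth of goods that the country
imports from the US. Washington welcomes the decision.'''

# Assumption: a hard-coded, trimmed copy of the documented result,
# not the output of a live call to the library.
result = {
    'cities': {
        'Shanghai': {
            'count': 2,
            'span_info': [(0, 8), (45, 53)],
            'found_as': ['Shanghai', 'Shanghai'],
        },
    },
    'countries': {
        'China': {
            'count': 1,
            'span_info': [(64, 69)],
            'found_as': ['China'],
        },
    },
}


def matched_substrings(text, result):
    """Hypothetical helper: slice every reported span back out of the text."""
    found = {}
    for places in result.values():
        for name, info in places.items():
            found[name] = [text[start:end] for start, end in info['span_info']]
    return found


def counts_by_category(result):
    """Hypothetical helper: reduce the result to {category: {name: count}}."""
    return {
        category: {name: info['count'] for name, info in places.items()}
        for category, places in result.items()
    }


substrings = matched_substrings(input_text, result)
counts = counts_by_category(result)
print(substrings)  # {'Shanghai': ['Shanghai', 'Shanghai'], 'China': ['China']}
print(counts)      # {'cities': {'Shanghai': 2}, 'countries': {'China': 1}}
```

Because offsets depend on the exact characters of the input, including whitespace and line breaks, spans are only meaningful against the very string that was passed in.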
Getting Started
These instructions will get you a copy of the project up and running on your local machine for development and testing purposes.
Installing
pip:
pip install flashgeotext
conda:
conda install flashgeotext
for development:
git clone https://github.com/iwpnd/flashgeotext.git
cd flashgeotext/
poetry install
Running the tests
poetry run pytest . -v
Authors
- Benjamin Ramser - Initial work - iwpnd
See also the list of contributors who participated in this project.
License
This project is licensed under the MIT License. See the LICENSE.md file for details.
Demo data: cities from http://www.geonames.org, licensed under the Creative Commons Attribution 3.0 License.
Acknowledgments