Extract and count countries and cities (+their synonyms) from text

flashgeotext :zap::earth_africa:

Extract and count countries and cities (plus their synonyms) from text, like GeoText on steroids, using FlashText, an Aho-Corasick implementation. Flashgeotext is a fast, batteries-included (and BYOD), native Python library that extracts one or more sets of given city and country names (plus synonyms) from an input text.

Introductory blog post: https://iwpnd.github.io/articles/2020-02/flashgeotext-library
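To make the idea of synonym-aware keyword extraction more concrete, here is a minimal pure-Python sketch of a character trie that maps every synonym to its canonical name and reports counts and spans. It is an illustration only, not flashgeotext's or FlashText's actual implementation; the function names `build_trie` and `extract_names` and the small `lookup` dictionary are hypothetical.

```python
from collections import defaultdict

END = "_end_"  # marker key; cannot collide with single-character keys


def build_trie(lookup):
    """Build a character trie from {canonical_name: [synonyms, ...]}."""
    trie = {}
    for canonical, synonyms in lookup.items():
        for name in [canonical, *synonyms]:
            node = trie
            for char in name:
                node = node.setdefault(char, {})
            node[END] = canonical  # a complete keyword ends here
    return trie


def extract_names(text, trie):
    """Return {canonical: {"count", "span_info", "found_as"}} for text."""
    results = defaultdict(lambda: {"count": 0, "span_info": [], "found_as": []})
    i = 0
    while i < len(text):
        # Only start matching at a word boundary.
        if i > 0 and text[i - 1].isalnum():
            i += 1
            continue
        node, j, best = trie, i, None
        while j < len(text) and text[j] in node:
            node = node[text[j]]
            j += 1
            at_boundary = j == len(text) or not text[j].isalnum()
            if END in node and at_boundary:
                best = (node[END], j)  # remember the longest match so far
        if best is None:
            i += 1
            continue
        canonical, end = best
        entry = results[canonical]
        entry["count"] += 1
        entry["span_info"].append((i, end))
        entry["found_as"].append(text[i:end])
        i = end  # continue after the matched keyword
    return dict(results)


# Hypothetical demo data: canonical names with their synonyms.
lookup = {
    "Washington, D.C.": ["Washington"],
    "United States": ["US", "USA"],
}
trie = build_trie(lookup)
found = extract_names("Washington welcomes the US decision.", trie)
print(found)
```

The matcher walks the text once per start position and keeps the longest keyword that ends on a word boundary, so "US" is found but the "US" inside "USB" is not.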

Usage

from flashgeotext.geotext import GeoText

geotext = GeoText()

input_text = '''Shanghai. The Chinese Ministry of Finance in Shanghai said that China plans
                to cut tariffs on $75 billion worth of goods that the country
                imports from the US. Washington welcomes the decision.'''

geotext.extract(input_text=input_text)
>> {
    'cities': {
        'Shanghai': {
            'count': 2,
            'span_info': [(0, 8), (45, 53)],
            'found_as': ['Shanghai', 'Shanghai'],
            },
        'Washington, D.C.': {
            'count': 1,
            'span_info': [(175, 185)],
            'found_as': ['Washington'],
            }
        },
    'countries': {
        'China': {
            'count': 1,
            'span_info': [(64, 69)],
            'found_as': ['China'],
            },
        'United States': {
            'count': 1,
            'span_info': [(171, 173)],
            'found_as': ['US'],
            }
        }
    }
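The returned dictionary is plain nested Python data, so it can be post-processed without any further library calls. The sketch below copies the result from the example above and shows one possible way to rank places by mention count and to list all mentions in reading order. The helper names `mention_counts` and `mentions_in_order` are hypothetical and not part of flashgeotext.

```python
from collections import Counter

# Result copied verbatim from the usage example above.
result = {
    "cities": {
        "Shanghai": {
            "count": 2,
            "span_info": [(0, 8), (45, 53)],
            "found_as": ["Shanghai", "Shanghai"],
        },
        "Washington, D.C.": {
            "count": 1,
            "span_info": [(175, 185)],
            "found_as": ["Washington"],
        },
    },
    "countries": {
        "China": {
            "count": 1,
            "span_info": [(64, 69)],
            "found_as": ["China"],
        },
        "United States": {
            "count": 1,
            "span_info": [(171, 173)],
            "found_as": ["US"],
        },
    },
}


def mention_counts(result):
    """Sum the counts per canonical name across all categories."""
    counts = Counter()
    for places in result.values():
        for name, info in places.items():
            counts[name] += info["count"]
    return counts


def mentions_in_order(result):
    """Flatten to (start, end, category, name, found_as), sorted by position."""
    mentions = []
    for category, places in result.items():
        for name, info in places.items():
            for (start, end), found_as in zip(info["span_info"], info["found_as"]):
                mentions.append((start, end, category, name, found_as))
    return sorted(mentions)


counts = mention_counts(result)
ordered = mentions_in_order(result)
print(counts.most_common(1))  # [('Shanghai', 2)]
print(ordered[0])             # (0, 8, 'cities', 'Shanghai', 'Shanghai')
```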

Getting Started

These instructions will get you a copy of the project up and running on your local machine for development and testing purposes.

Installing

pip:

pip install flashgeotext

conda:

conda install flashgeotext

for development:

git clone https://github.com/iwpnd/flashgeotext.git
cd flashgeotext/
poetry install

Running the tests

poetry run pytest . -v

Authors

  • Benjamin Ramser - Initial work - iwpnd

See also the list of contributors who participated in this project.

License

This project is licensed under the MIT License. See the LICENSE.md file for details.

The demo data for cities comes from GeoNames (http://www.geonames.org) and is licensed under the Creative Commons Attribution 3.0 License.
