easy-geoparsing

Easy-to-use module for streamlined parsing of countries from plaintext locations and top-level domains, plus manipulation of country names and ISO 2 & 3 character codes.

Implementation relies on the RESTcountries API for country data and on the GeoText library for natural-language parsing.

installation

To install from the command line via pip, do:

pip install easy-geoparsing

To upgrade to the latest version via pip do:

pip install easy-geoparsing --upgrade

To use via pipenv put the following in your Pipfile:

[packages]
easy-geoparsing = ">=1.0.0"

development

If you've cloned the repository, the recommended way to set it up is with pipenv.

If you don't yet have pipenv, you can use pip to install it from the command line:

pip install pipenv --upgrade

Then, in the top level directory of this repository, easy-geoparsing, do:

pipenv install --dev

This will create the virtual environment and install the requirements (viewable in the Pipfile). The --dev flag additionally installs the packages needed for development, such as those used for testing.

usage

GETTING STARTED

Import the parser and create an instance as follows. Creating an instance of EasyCountryParser automatically downloads the country data payload from RESTcountries and sets up all the resources, so start-up time depends on your internet connection, although the payload is not large.

from easy_geoparsing import EasyCountryParser

ez_parser = EasyCountryParser()

Or, if you don't want to use our alternative names for some of the countries (i.e. you want to follow the RESTcountries standard exactly):

ez_parser = EasyCountryParser(altnames=False)

The EasyCountryParser class provides utilities, based on the data from the RESTcountries API and the GeoText natural-language parser library, for easily extracting and handling country names and codes.

PARSER RESOURCES

The parser is initialised with the following resources:

  • .data - pandas DataFrame containing RESTcountries data
  • .tld_to_a2c - python dict, maps TLDs to 2-character ISO codes
  • .tld_to_a3c - python dict, maps TLDs to 3-character ISO codes
  • .iso2to3 - python dict, maps 2-character ISO codes to 3-character ISO codes
  • .iso3to2 - python dict, maps 3-character ISO codes to 2-character ISO codes
  • .a2c_map - python dict, maps 2-character ISO codes to full names
  • .a3c_map - python dict, maps 3-character ISO codes to full names
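The mappers above can be chained, for example to go from a URL's top-level domain to a full country name. The sketch below runs offline on small hand-written stand-in dicts rather than the real parser resources. The key formats (TLDs with a leading dot, upper-case ISO codes) are assumptions about how the real dicts are keyed, and `country_from_url` is a hypothetical helper, so check the actual resources before relying on it.

```python
from urllib.parse import urlparse

# Stand-in samples shaped like the parser's dicts (NOT the real data).
# Assumed key formats: TLDs with a leading dot, upper-case ISO codes.
tld_to_a2c = {".fr": "FR", ".de": "DE"}
iso2to3 = {"FR": "FRA", "DE": "DEU"}
a3c_map = {"FRA": "France", "DEU": "Germany"}


def country_from_url(url, tld_map, a2_to_a3, names):
    """Return (alpha-2, alpha-3, name) for a URL's TLD, or None if unmapped."""
    host = urlparse(url).hostname or ""
    if "." not in host:
        return None
    # Take the last dot-separated label of the hostname as the TLD.
    tld = "." + host.rsplit(".", 1)[-1].lower()
    a2c = tld_map.get(tld)
    if a2c is None:
        return None  # e.g. generic TLDs that map to no country
    a3c = a2_to_a3[a2c]
    return a2c, a3c, names[a3c]


print(country_from_url("https://www.example.fr/page", tld_to_a2c, iso2to3, a3c_map))
```

With a real parser instance, the same function could presumably be called with `ez_parser.tld_to_a2c`, `ez_parser.iso2to3` and `ez_parser.a3c_map` in place of the stand-ins.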

PARSER METHODS

The parser has the following methods for handling locations data:

  • .retrieve_country - parses plaintext for extractable 2-character ISO codes for countries (which can then be manipulated using the mappers above)
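The exact call signature and return shape of .retrieve_country are not documented here, so the sketch below makes no claim about them. It shows one way to turn whatever codes come back into full names using a dict shaped like .a2c_map, accepting a single code, a collection of codes, or None. `codes_to_names` and `sample_a2c_map` are hypothetical names used only for illustration.

```python
def codes_to_names(result, a2c_map):
    """Normalise a retrieve_country-style result to a list of country names."""
    if result is None:
        return []
    # Accept either one code string or any iterable of code strings.
    codes = [result] if isinstance(result, str) else list(result)
    # Silently skip codes that the mapping does not know about.
    return [a2c_map[code] for code in codes if code in a2c_map]


# Stand-in for ez_parser.a2c_map (NOT the real data).
sample_a2c_map = {"FR": "France", "JP": "Japan"}

print(codes_to_names("FR", sample_a2c_map))
print(codes_to_names(["JP", "XX"], sample_a2c_map))

# Assumed usage with a real parser (argument form is an assumption):
# names = codes_to_names(ez_parser.retrieve_country("Lyon, France"), ez_parser.a2c_map)
```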
