Easy-to-use module for streamlined parsing of countries from locations
easy-geoparsing
Easy-to-use module for streamlined parsing of countries from plaintext locations and top-level domains, plus manipulation of country names and ISO 2 & 3 character codes.
Implementation relies on:
- the RESTcountries API
- the geotext module
installation
To install from the command line via pip, do:
pip install easy-geoparsing
To upgrade to the latest version via pip, do:
pip install easy-geoparsing --upgrade
To use via pipenv, put the following in your Pipfile:
[packages]
easy-geoparsing = ">=1.0.0"
development
If you've cloned the repository, the best way to get it working is with pipenv. If you don't yet have pipenv, you can use pip to install it from the command line:
pip install pipenv --upgrade
Then, in the top-level directory of this repository, easy-geoparsing, do:
pipenv install --dev
This will create the virtual environment and install the requirements (viewable in the Pipfile). The --dev flag will install packages needed for testing etc.
usage
GETTING STARTED
Do the following to get the parser utilities. Note that creating an instance of EasyCountryParser automatically downloads the country data payload from RESTcountries and sets up all the resources. Startup speed therefore depends on your internet connection, but the payload is not large.
from easy_geoparsing import EasyCountryParser
ez_parser = EasyCountryParser()
Or, if you don't want to use our alternative names for some of the countries (i.e. you want to follow the RESTcountries standard exactly):
ez_parser = EasyCountryParser(altnames=False)
The EasyCountryParser class provides utilities, based on the data from the RESTcountries API and the GeoText natural-language parser library, for easily extracting and handling country names and codes.
PARSER RESOURCES
The parser is initialised with the following resources:
- .data - pandas DataFrame containing RESTcountries data
- .tld_to_a2c - python dict, maps TLDs to 2-character ISO codes
- .tld_to_a3c - python dict, maps TLDs to 3-character ISO codes
- .iso2to3 - python dict, maps 2-character ISO codes to 3
- .iso3to2 - python dict, maps 3-character ISO codes to 2
- .a2c_map - python dict, maps 2-char ISO codes to full names
- .a3c_map - python dict, maps 3-char ISO codes to full names
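To give a feel for how the mapping resources above fit together, here is a minimal, self-contained sketch. It does not use the package or the network: SAMPLE_PAYLOAD is a hand-written stand-in that assumes the RESTcountries v2 field names (name, alpha2Code, alpha3Code, topLevelDomain), and build_mappers is a hypothetical helper, not the package's own code. The real key formats (for example, whether TLD keys keep the leading dot) may differ.

```python
# Hand-written sample, assumed to follow the RESTcountries v2 field names.
SAMPLE_PAYLOAD = [
    {"name": "France", "alpha2Code": "FR", "alpha3Code": "FRA", "topLevelDomain": [".fr"]},
    {"name": "Japan", "alpha2Code": "JP", "alpha3Code": "JPN", "topLevelDomain": [".jp"]},
    {"name": "Brazil", "alpha2Code": "BR", "alpha3Code": "BRA", "topLevelDomain": [".br"]},
]


def build_mappers(payload):
    """Hypothetical helper: build dicts named after the parser resources."""
    mappers = {
        "tld_to_a2c": {},
        "tld_to_a3c": {},
        "iso2to3": {},
        "iso3to2": {},
        "a2c_map": {},
        "a3c_map": {},
    }
    for entry in payload:
        a2c, a3c, name = entry["alpha2Code"], entry["alpha3Code"], entry["name"]
        mappers["iso2to3"][a2c] = a3c
        mappers["iso3to2"][a3c] = a2c
        mappers["a2c_map"][a2c] = name
        mappers["a3c_map"][a3c] = name
        # A country can list more than one top-level domain.
        for tld in entry["topLevelDomain"]:
            mappers["tld_to_a2c"][tld] = a2c
            mappers["tld_to_a3c"][tld] = a3c
    return mappers


mappers = build_mappers(SAMPLE_PAYLOAD)

# Chain the mappers: TLD -> 2-character code -> 3-character code -> full name.
a2c = mappers["tld_to_a2c"][".jp"]
a3c = mappers["iso2to3"][a2c]
country_name = mappers["a3c_map"][a3c]
print(a2c, a3c, country_name)
```

With a real parser instance, the same chaining would presumably go through the attributes directly (e.g. ez_parser.iso2to3), subject to the key formats the package actually uses.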
PARSER METHODS
The parser has the following methods for handling locations data:
- .retrieve_country - parses plaintext for extractable 2-character ISO codes for countries (which can then be manipulated using the mappers above)
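As an illustration of what extracting 2-character ISO codes from plaintext involves, the sketch below does a simple whole-word name lookup. It is not the package's implementation (which relies on the geotext module), and NAME_TO_A2C and find_country_codes are hypothetical names introduced only for this example.

```python
import re

# Hypothetical stand-in lookup: lowercase country name -> 2-character ISO code.
NAME_TO_A2C = {
    "france": "FR",
    "japan": "JP",
    "brazil": "BR",
    "united kingdom": "GB",
}


def find_country_codes(text):
    """Return ISO 2-character codes for country names found in the text,
    ordered by where each name first appears."""
    lowered = text.lower()
    hits = []
    for name, code in NAME_TO_A2C.items():
        # Match whole words only, so a name inside a longer word is ignored.
        match = re.search(r"\b" + re.escape(name) + r"\b", lowered)
        if match:
            hits.append((match.start(), code))
    return [code for _, code in sorted(hits)]


codes = find_country_codes("Offices in Tokyo, Japan and Lyon, France")
print(codes)
```

The returned codes can then be passed through the mappers described above, for example to convert them to 3-character codes or full names.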