Extract countries, regions and cities from a URL or text
Extract place names from a URL or text, and add context to those names – for example distinguishing between a country, region or city.
## Install & Setup
Grab the package using pip (this will take a few minutes):

    pip install geograpy2
Geograpy2 uses [NLTK](http://www.nltk.org/) for entity recognition, so you’ll also need to download the models we’re using. Fortunately there’s a command that’ll take care of this for you:

    geograpy-nltk
## Basic Usage
Import the module, give some text or a URL, and presto.

    import geograpy2
    url = 'http://www.bbc.com/news/world-europe-26919928'
    places = geograpy2.get_place_context(url=url)
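What you do with `places` depends on what the returned object exposes, which this README does not spell out. As a rough sketch, assuming it carries `countries`, `regions` and `cities` attributes like the `PlaceContext` object in the original geograpy (an assumption, not confirmed for Geograpy2), a small helper could collect whatever is present. `summarize_places` and `fake_places` are hypothetical names used only for illustration.

```python
from types import SimpleNamespace

# Attribute names assumed from the original geograpy's PlaceContext;
# Geograpy2 may name them differently.
ASSUMED_FIELDS = ("countries", "regions", "cities")


def summarize_places(places):
    """Return {field: sorted unique names} for each assumed field that exists."""
    summary = {}
    for field in ASSUMED_FIELDS:
        values = getattr(places, field, None)
        if values is not None:
            summary[field] = sorted(set(values))
    return summary


# Stand-in object so the sketch runs without Geograpy2 installed.
fake_places = SimpleNamespace(
    countries=["United Kingdom", "Ukraine", "Ukraine"],
    cities=["Kyiv"],
)
print(summarize_places(fake_places))
```

With the real library you would pass in the object returned by `geograpy2.get_place_context(url=url)` instead of the stand-in. Missing attributes are skipped rather than raising an error.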
## Credits
Geograpy2 is a fork of [geograpy](https://github.com/ushahidi/geograpy) and inherits most of it, but solves several problems (such as UTF-8 support, place names with multiple words, and confusion over homonyms).
Geograpy2 uses the following excellent libraries:

- [NLTK](http://www.nltk.org/) for entity recognition
- [newspaper](https://github.com/codelucas/newspaper) for text extraction from HTML
- [jellyfish](https://github.com/sunlightlabs/jellyfish) for fuzzy text matching
- [pycountry](https://pypi.python.org/pypi/pycountry) for country/region lookups
Geograpy2 uses the following data sources:

- [GeoLite2](http://dev.maxmind.com/geoip/geoip2/geolite2/) for city lookups
- [ISO3166ErrorDictionary](https://github.com/bodacea/countryname/blob/master/countryname/databases/ISO3166ErrorDictionary.csv) for common country misspellings _via [Sara-Jayne Terp](https://github.com/bodacea)_
Hat tip to [Chris Albon](https://github.com/chrisalbon) for the name.
Released under the MIT license.