World Guess is a package to identify subject countries in documents
This Python package guesses the country that a text, a place name, or a list of places refers to, based on place-name frequencies. It works with any language or alphabet.
Originally, this library was made to be used with a list of places extracted by an NER tool such as spaCy.
I heavily recommend using it that way.
It is also possible to use it directly on a text, but the precision is not very good, because a word in one language can match a place name in another.
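As a rough sketch of the recommended workflow, the helper below keeps only the place-like entities from NER output, so that they can then be handed to the guesser as a list. It assumes the entities arrive as (text, label) pairs, which is what you would get from spaCy with `[(ent.text, ent.label_) for ent in doc.ents]`, and that the place labels are `GPE` and `LOC` as in spaCy's English models. `extract_places` and `PLACE_LABELS` are hypothetical names for illustration, not part of this package.

```python
# Labels treated as places (assumption: spaCy's English label scheme).
PLACE_LABELS = {"GPE", "LOC"}


def extract_places(entities, labels=PLACE_LABELS):
    """Return the text of every entity whose label marks it as a place.

    `entities` is an iterable of (text, label) pairs.
    """
    places = []
    for text, label in entities:
        cleaned = text.strip()
        # Skip non-place entities and empty strings.
        if label in labels and cleaned:
            places.append(cleaned)
    return places


# Hand-written sample standing in for real NER output.
entities = [
    ("London", "GPE"),
    ("Boris Johnson", "PERSON"),
    ("Scotland", "GPE"),
    ("Thames", "LOC"),
]
places = extract_places(entities)
print(places)
```

The resulting list is the kind of input the list-based call is meant for; filtering first avoids feeding person or organisation names into the country guess.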
It is also still a work in progress. I wrote a first version of this library during a past internship, to quickly identify and classify documents by country, and thought it was a cool tool to share, so I recently rewrote it from scratch at home (with permission from my old boss).
It is an easy way to identify the source country of a news article, for example, and automatically tag it with that country.
With a list:
    wg = WorldGuesser()
    text = ["London", "Manchester", "UK", "BRISTOL", "Scotland", "Berlin"]
    result = wg.from_list(text)
    self.assertEqual(result, "United Kingdom")
With a name:
    wg = WorldGuesser()
    text = "санкт-петербург"
    result = wg.from_place(text)
    self.assertEqual(result, "Russia")
If no country is found, the first result in the list will be "Unknown".
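To make the frequency idea concrete, here is a minimal sketch of how a guess based on place-name frequencies could work. It is not the package's actual implementation: the tiny `TOY_GAZETTEER` is hand-written for illustration (the real data comes from a much larger database), and `guess_country` is a hypothetical function. Each place name votes for every country it can belong to, countries are ranked by vote count, and the result falls back to "Unknown" when nothing matches.

```python
from collections import Counter

# Hand-written toy data (assumption): lower-cased place name -> candidate countries.
TOY_GAZETTEER = {
    "london": ["United Kingdom", "Canada"],
    "manchester": ["United Kingdom", "United States"],
    "uk": ["United Kingdom"],
    "bristol": ["United Kingdom", "United States"],
    "scotland": ["United Kingdom"],
    "berlin": ["Germany", "United States"],
}


def guess_country(places, gazetteer=TOY_GAZETTEER):
    """Rank countries by how many of the given place names they could contain."""
    votes = Counter()
    for place in places:
        # casefold() makes the lookup case-insensitive ("BRISTOL" -> "bristol").
        for country in gazetteer.get(place.casefold(), []):
            votes[country] += 1
    if not votes:
        # No place name matched anything in the gazetteer.
        return ["Unknown"]
    return [country for country, _ in votes.most_common()]


ranked = guess_country(["London", "Manchester", "UK", "BRISTOL", "Scotland", "Berlin"])
print(ranked[0])
```

An ambiguous name such as "Berlin" or "London" adds a vote to several countries, but the country shared by most names in the list still comes out on top, which is why a list of places tends to give a more reliable answer than a single name.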
The data comes from the GeoNames database: https://www.geonames.org/