World Guess is a package to identify subject countries in documents

Project description

worldguess

Summary

This Python package guesses the country that a text, a place name, or a list of place names is about, based on place-name frequencies. It works with any language or alphabet.
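The idea of ranking countries by place-name frequency can be illustrated with a toy sketch. The table `TOY_GAZETTEER` and the function `guess_country` below are hypothetical names introduced for illustration only; this is not the package's implementation, and the real data comes from GeoNames.

```python
from collections import Counter

# Hypothetical toy gazetteer: lowercase place name -> country.
# The real package relies on GeoNames data, not on this table.
TOY_GAZETTEER = {
    "london": "United Kingdom",
    "manchester": "United Kingdom",
    "bristol": "United Kingdom",
    "scotland": "United Kingdom",
    "uk": "United Kingdom",
    "berlin": "Germany",
    "munich": "Germany",
}


def guess_country(places):
    """Rank countries by how many of the given place names map to them."""
    counts = Counter()
    for place in places:
        country = TOY_GAZETTEER.get(place.strip().lower())
        if country is not None:
            counts[country] += 1
    if not counts:
        return ["Unknown"]
    # most_common() orders countries by descending count.
    return [country for country, _ in counts.most_common()]


ranking = guess_country(["London", "Manchester", "UK", "BRISTOL", "Scotland", "Berlin"])
print(ranking)  # ['United Kingdom', 'Germany']
```

Lowercasing the input before the lookup is one simple way to make mixed-case entries such as "BRISTOL" match.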

Warning

Originally, this library was made to be used with a list of places extracted by an NER program such as spaCy.

I strongly recommend using it that way.
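A minimal sketch of that workflow, assuming spaCy and one of its English models (here `en_core_web_sm`) are installed. The helper names `keep_place_entities` and `extract_places` are hypothetical, and filtering on the `GPE` and `LOC` labels is an assumption about which entity types count as places.

```python
def keep_place_entities(entities, place_labels=("GPE", "LOC")):
    """Keep the text of (text, label) pairs whose label marks a place."""
    return [text for text, label in entities if label in place_labels]


def extract_places(text, model_name="en_core_web_sm"):
    """Run spaCy NER on `text` and return the place names it finds.

    Assumes spaCy and the named model are installed. The import is lazy so
    that keep_place_entities stays usable without spaCy.
    """
    import spacy

    nlp = spacy.load(model_name)
    doc = nlp(text)
    return keep_place_entities((ent.text, ent.label_) for ent in doc.ents)


# The filtering step can be checked without spaCy, using hand-written entities.
sample_entities = [("London", "GPE"), ("Alice", "PERSON"), ("the Alps", "LOC")]
places = keep_place_entities(sample_entities)
print(places)  # ['London', 'the Alps']
```

The resulting list could then be passed to `from_list`, as in the Usage section.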

It is also possible to use it on raw text, but the precision is poor, because some ordinary words in one language are place names in another.
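The following toy sketch shows why raw text is noisy. The gazetteer and the function `match_words` are hypothetical and only illustrate the problem; they do not reflect how the package matches words internally.

```python
# Hypothetical toy gazetteer. The entries are real places, but the table is
# only an illustration and not the package's data.
TOY_GAZETTEER = {
    "nice": "France",          # also an English adjective
    "bath": "United Kingdom",  # also an English noun
    "of": "Turkey",            # Of is a town in Trabzon Province
    "london": "United Kingdom",
}


def match_words(words):
    """Return (word, country) pairs for every word found in the gazetteer."""
    return [
        (word, TOY_GAZETTEER[word.lower()])
        for word in words
        if word.lower() in TOY_GAZETTEER
    ]


sentence = "It was a nice view of London"

# Matching every word of the sentence picks up two false positives.
raw_matches = match_words(sentence.split())
print(raw_matches)
# [('nice', 'France'), ('of', 'Turkey'), ('London', 'United Kingdom')]

# Matching only what an NER step would ideally keep avoids them.
ner_matches = match_words(["London"])
print(ner_matches)  # [('London', 'United Kingdom')]
```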

It is also still a work in progress. I wrote a first version of this library during a past internship to quickly identify and classify documents by country. I thought it was a cool tool to share, so I recently rewrote it from scratch at home (with my former boss's permission).

It is an easy way to identify the source country of a news article, for example, and to tag the article with that country automatically.

Usage

With a list:

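# Import path assumed from the package name; the original README does not show it.
from worldguess import WorldGuesser

wg = WorldGuesser()
text = ["London", "Manchester", "UK", "BRISTOL", "Scotland", "Berlin"]
result = wg.from_list(text)
assert result[0] == "United Kingdom"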

With a name:

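# Import path assumed from the package name; the original README does not show it.
from worldguess import WorldGuesser

wg = WorldGuesser()
text = "санкт-петербург"
result = wg.from_place(text)
assert result[0] == "Russia"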

If no country is found, the first result in the list will be "Unknown".
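When tagging documents automatically, the "Unknown" case can be handled explicitly. The sketch below assumes the result is an indexable sequence of country names, as the examples above suggest. The helpers `top_country` and `tag_document` are hypothetical and not part of the package.

```python
def top_country(result):
    """Return the first guess, or None when it is missing or "Unknown"."""
    if not result or result[0] == "Unknown":
        return None
    return result[0]


def tag_document(doc, result):
    """Return a copy of `doc` with a "country" key only when a guess exists."""
    tagged = dict(doc)
    country = top_country(result)
    if country is not None:
        tagged["country"] = country
    return tagged


article = {"title": "Floods in the north"}
print(tag_document(article, ["United Kingdom", "Germany"]))
# {'title': 'Floods in the north', 'country': 'United Kingdom'}
print(tag_document(article, ["Unknown"]))
# {'title': 'Floods in the north'}
```

Leaving the tag off, instead of storing "Unknown", keeps unresolved documents easy to filter later. Whether that suits a given pipeline is a design choice.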

Data Sources

The data comes from the GeoNames database: https://www.geonames.org/

Download files

Source Distribution

worldguess-0.0.1.tar.gz (2.0 kB)

Uploaded Source

Built Distribution

worldguess-0.0.1-py3-none-any.whl (3.0 kB)

Uploaded Python 3

File details

Details for the file worldguess-0.0.1.tar.gz.

File metadata

  • Download URL: worldguess-0.0.1.tar.gz
  • Upload date:
  • Size: 2.0 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.1.1 pkginfo/1.5.0.1 requests/2.22.0 setuptools/45.1.0 requests-toolbelt/0.9.1 tqdm/4.41.1 CPython/3.6.9

File hashes

Hashes for worldguess-0.0.1.tar.gz
Algorithm Hash digest
SHA256 2e28e4223c26e980ec259fee8dfbbfcfe06722d940495d8bdf050e3d9b71e6f8
MD5 3a772bf1bccf38181374a4f30b37b599
BLAKE2b-256 407de625c479a92f9bb0b379392e08abfc2c5ecad709999b852c8fc30096fb12

File details

Details for the file worldguess-0.0.1-py3-none-any.whl.

File metadata

  • Download URL: worldguess-0.0.1-py3-none-any.whl
  • Upload date:
  • Size: 3.0 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.1.1 pkginfo/1.5.0.1 requests/2.22.0 setuptools/45.1.0 requests-toolbelt/0.9.1 tqdm/4.41.1 CPython/3.6.9

File hashes

Hashes for worldguess-0.0.1-py3-none-any.whl
Algorithm Hash digest
SHA256 3b79f6972753754a02373ecb19856737580ebec7daec395319708aa63629d634
MD5 8bddcfb30b414797e0ba3dbeb3479600
BLAKE2b-256 ee78075d3bdca64626485eba35a354413caf27cf5d5e2cf62ab19ad6ab623a97
