Fuzzy lookup of country information

Project description

countryguess looks up country information by fuzzy name matching. It tries to be lean (but not mean) and fast: all dependencies are in the Python standard library, and country data is loaded lazily on demand.

Code: Codeberg
Package: PyPI

Usage

guess_country() uses the default country data that ships with the package.

>>> from countryguess import guess_country
>>> guess_country("britain")
{
    'name_short': 'United Kingdom',
    'name_official': 'United Kingdom of Great Britain and Northern Ireland',
    'iso2': 'GB',
    'iso3': 'GBR',
    ...
}
>>> guess_country("no such country")
None
>>> guess_country("no such country", default="Oh, well.")
'Oh, well.'
>>> guess_country("PoRtUgAl", attribute="iso2")
'PT'
>>> guess_country("TW", attribute="name_official")  # 2-letter code lookup
'Republic of China'
>>> guess_country("tWn", attribute="name_short")    # 3-letter code lookup
'Taiwan'

You can also create a CountryData instance yourself to provide your own country data.

>>> from countryguess import CountryData
>>> countries = CountryData("path/to/countries.json")
>>> countries["vIeTnAm"]
{'name_short': 'Vietnam', ...}
>>> countries["vn"]
{'name_short': 'Vietnam', ...}
>>> countries["asdf"]
KeyError: 'asdf'
>>> countries.get("asdf")
None
>>> countries.get("kuwait")
{'name_short': 'Kuwait', ...}

On CountryData instances, every key in the JSON data is accessible as a method.

>>> countries.name_official("portugal")
'Portuguese Republic'
>>> countries.continent("vanuatu")
'Oceania'
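
As a rough illustration of how key-named methods like these could be provided, here is a minimal sketch built on Python's `__getattr__` hook. This is an assumption about one possible approach, not the package's actual implementation: the class `AttributeLookup`, its exact-name index, and the `default` parameter are all hypothetical.

```python
class AttributeLookup:
    """Toy stand-in for CountryData; not the package's actual class."""

    def __init__(self, records):
        # Index records by lower-cased short name (for this sketch only;
        # the real package also matches codes, regexes and fuzzy names).
        self._records = {r["name_short"].lower(): r for r in records}

    def __getattr__(self, key):
        # Python calls __getattr__ only when normal attribute lookup fails,
        # so real attributes and methods are unaffected.
        if key.startswith("_"):
            raise AttributeError(key)

        def getter(country, default=None):
            record = self._records.get(country.lower())
            return default if record is None else record.get(key, default)

        return getter


data = AttributeLookup([
    {"name_short": "Portugal", "name_official": "Portuguese Republic"},
    {"name_short": "Vanuatu", "continent": "Oceania"},
])
print(data.name_official("portugal"))
print(data.continent("vanuatu"))
```

In this sketch, an unknown country or a key missing from a record yields `default` instead of raising; the real class may behave differently.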

Country Lookup

Countries are identified by name, 2-letter code (ISO 3166-1 alpha-2) or 3-letter code (ISO 3166-1 alpha-3). All identifiers are matched case-insensitively.

Names are matched with regular expressions that are stored in the JSON data. If that fails, fuzzy matching against name_short and name_official is done with difflib.
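
The lookup order described above can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the package's code: the function `lookup`, the two sample records, their regular expressions, and the `cutoff` value of 0.8 are all hypothetical. Only `re.search` and `difflib.get_close_matches` are real standard-library calls.

```python
import difflib
import re

# Hypothetical sample records; the regex values are made up for this sketch.
COUNTRIES = [
    {"name_short": "United Kingdom",
     "name_official": "United Kingdom of Great Britain and Northern Ireland",
     "iso2": "GB", "iso3": "GBR",
     "regex": r"united.?kingdom|britain"},
    {"name_short": "Puerto Rico", "name_official": "Puerto Rico",
     "iso2": "PR", "iso3": "PRI",
     "regex": r"puerto.?rico"},
]


def lookup(query, countries=COUNTRIES, cutoff=0.8):
    """Return the first matching country record, or None."""
    q = query.strip()
    # 1. Case-insensitive match on the 2-letter and 3-letter codes.
    for c in countries:
        if q.upper() in (c["iso2"], c["iso3"]):
            return c
    # 2. Regular expression stored with each record.
    for c in countries:
        if re.search(c["regex"], q, flags=re.IGNORECASE):
            return c
    # 3. Fuzzy fallback against name_short and name_official via difflib.
    names = {}
    for c in countries:
        names[c["name_short"].lower()] = c
        names[c["name_official"].lower()] = c
    close = difflib.get_close_matches(q.lower(), list(names), n=1, cutoff=cutoff)
    return names[close[0]] if close else None
```

With these sample records, "britain" is caught by the regular expression, "gbr" by the code check, and the misspelled "puerto ricco" only by the fuzzy fallback.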

Country Data

Country information is read from a JSON file. One is shipped with the package, but you can also provide your own to the CountryData class as described above. The information in the default file was gratefully extracted from country-converter. (Many thanks!)

The country data file must contain a list of JSON objects. Each object represents a country and must contain at least the following keys:

  • name_short
  • name_official
  • iso2
  • iso3
  • regex
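
To make the expected file layout concrete, here is a small sketch that checks JSON text against the required keys listed above, using only the standard `json` module. The helper `validate_country_data` is hypothetical and not part of countryguess; the sample record, including its regex value and the extra `continent` key, is illustrative only.

```python
import json

REQUIRED_KEYS = {"name_short", "name_official", "iso2", "iso3", "regex"}


def validate_country_data(text):
    """Parse JSON text and check it is a list of objects with the required keys."""
    data = json.loads(text)
    if not isinstance(data, list):
        raise ValueError("top-level JSON value must be a list")
    for index, entry in enumerate(data):
        if not isinstance(entry, dict):
            raise ValueError(f"entry {index} is not a JSON object")
        missing = REQUIRED_KEYS.difference(entry)
        if missing:
            raise ValueError(f"entry {index} is missing keys: {sorted(missing)}")
    return data


# Hypothetical one-country data file; additional keys are allowed.
SAMPLE = """
[
  {
    "name_short": "Vietnam",
    "name_official": "Socialist Republic of Vietnam",
    "iso2": "VN",
    "iso3": "VNM",
    "regex": "viet.?nam",
    "continent": "Asia"
  }
]
"""

countries = validate_country_data(SAMPLE)
```

A file that passes a check like this has the structure the section describes and could then be passed to CountryData by path, as shown in the Usage section.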

Packaged Classification Schemes

The following classification schemes are available in the included country data.

  1. ISO2 (ISO 3166-1 alpha-2)
  2. ISO3 (ISO 3166-1 alpha-3)
  3. ISO - numeric (ISO 3166-1 numeric)
  4. UN numeric code (M.49 - largely follows ISO numeric)
  5. A standard or short name
  6. The "official" name
  7. Continent
  8. UN region
  9. EXIOBASE 1 classification
  10. EXIOBASE 2 classification
  11. EXIOBASE 3 classification
  12. WIOD classification
  13. Eora
  14. OECD membership (per year)
  15. MESSAGE 11-region classification
  16. IMAGE
  17. REMIND
  18. UN membership (per year)
  19. EU membership (including EU12, EU15, EU25, EU27, EU27_2007, EU28)
  20. EEA membership
  21. Schengen region
  22. Cecilia 2050 classification
  23. APEC
  24. BRIC
  25. BASIC
  26. CIS (as of 2019, excl. Turkmenistan)
  27. G7
  28. G20 (listing all EU member states as individual members)
  29. FAOcode (numeric)
  30. GBDcode (numeric - Global Burden of Disease country codes)
  31. IEA (World Energy Balances 2021)
  32. DACcode (numeric - OECD Development Assistance Committee)
  33. ccTLD - country code top-level domains
  34. GWcode - Gleditsch & Ward numerical codes as published in https://www.andybeger.com/states/articles/statelists.html

Command Line Interface

countryguess comes with a simple CLI of the same name. It takes one or two arguments: the country to look up and, optionally, the attribute to print.

$ countryguess oman
{
    "name_short": "Oman",
    "name_official": "Sultanate of Oman",
    ...
}
$ countryguess 'puerto ricco' name_official
Puerto Rico

Contributing

All kinds of bug reports, feature requests and suggestions are welcome!
