Extract and count countries and cities (+their synonyms) from text

Project description

Toponym

Build grammatical cases for words in Slavic languages from pre-defined recipes.

documentation: https://toponym.iwpnd.pw/

Getting Started

These instructions will get you a copy of the project up and running on your local machine for development and testing purposes.

Installing

for usage:

pip install toponym

for development:

git clone https://github.com/iwpnd/toponym.git
cd toponym
pip install flit
flit install --symlink

Description

Problem

In Slavic languages, a word changes form depending on its grammatical role in a sentence. The city name Moscow (Москва), for example, becomes Москве in the prepositional case. A plain substring check for the nominative form therefore fails:

"Москва" in "В Москве с начала года отремонтировали 3 тысячи подъездов"

>> False

Solution

This is where Toponym comes in. Using pre-defined recipes, it naively builds the grammatical cases of an input word based on the word's ending. A recipe looks as follows:

Recipe

recipe = {
    "а": {  # ending of the input word
        "nominative": [[""], 0],
        "genitive": [  # case that we need
            ["ы", "и"],  # endings of the output word
            1,  # number of chars to delete before the output ending is added
        ],
        "dative": [["е"], 1],
        "accusative": [["у"], 1],
        "instrumental": [...]
    }
}

If multiple endings are given for a case, one toponym is created per ending. Some of the resulting toponyms do not make sense or are not used in the wild. If you have an idea for how to filter out the implausible ones, please contact me.
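
The recipe mechanics described above can be illustrated with a minimal, standalone sketch. This is not the library's implementation: the function `build_forms`, the trimmed-down `RECIPE` dictionary, and the "longest matching ending wins" rule are assumptions made for illustration only.

```python
# Hypothetical, trimmed-down recipe for words ending in "а" (illustration only).
RECIPE = {
    "а": {
        "nominative": [[""], 0],
        "genitive": [["ы", "и"], 1],
        "dative": [["е"], 1],
        "accusative": [["у"], 1],
    }
}


def build_forms(word, recipe):
    """Return {case: [forms]} using the longest recipe ending that matches `word`."""
    # Assumption: longer endings are tried first so a more specific rule wins.
    for ending in sorted(recipe, key=len, reverse=True):
        if word.endswith(ending):
            forms = {}
            for case, (new_endings, n_delete) in recipe[ending].items():
                # Drop `n_delete` trailing characters, then append each new ending.
                stem = word[:len(word) - n_delete]
                forms[case] = [stem + new_ending for new_ending in new_endings]
            return forms
    # No recipe ending matched the input word.
    return {}


forms = build_forms("Москва", RECIPE)
print(forms["genitive"])  # ['Москвы', 'Москви']
print(forms["dative"])    # ['Москве']
```

Note that the genitive entry yields two candidates because two endings are listed; only "Москвы" is actually used, which is the over-generation mentioned above.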

With the built toponyms you can now check:

from toponym.recipes import Recipes
from toponym.toponym import Toponym

recipes_russian = Recipes()
recipes_russian.load_from_language(language='russian')

city = "Москва"

t = Toponym(input_word=city, recipes=recipes_russian)
t.build()

print(t.list_toponyms())
>> ['Москвой', 'Москвы', 'Москви', 'Москве', 'Москву', 'Москва']

any(word in "В Москве с начала года отремонтировали 3 тысячи подъездов" for word in t.list_toponyms())
>> True
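
The substring check above also matches inside longer words. If that matters, one possible refinement is to match whole words only and count the hits. The sketch below uses only the standard library; `count_mentions` is a hypothetical helper, not part of Toponym, and the list of forms is copied from the output shown above.

```python
import re

# Forms copied from the list_toponyms() output shown above.
toponyms = ['Москвой', 'Москвы', 'Москви', 'Москве', 'Москву', 'Москва']


def count_mentions(text, forms):
    """Count whole-word occurrences of any of the given forms in `text`."""
    # Longest forms first, so the alternation prefers the longest candidate.
    ordered = sorted(forms, key=len, reverse=True)
    pattern = r"\b(?:" + "|".join(re.escape(form) for form in ordered) + r")\b"
    return len(re.findall(pattern, text))


text = "В Москве с начала года отремонтировали 3 тысячи подъездов"
print(count_mentions(text, toponyms))  # 1
```

In Python 3, `\b` is Unicode-aware for `str` patterns, so it works with Cyrillic text without extra flags.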

supported languages:

full name    iso code
croatian     hr
russian      ru
ukrainian    uk
romanian     ro
latvian      lv
hungarian    hu
greek        el
polish       pl

Running the tests

pytest toponym/tests/
