
Phonetic algorithms (Soundex and Metaphone) for the Russian, English, Swedish, Finnish and Estonian languages

Fonetika

Phonetic algorithms for Russian, English, Swedish, Estonian and Finnish, based on Soundex and Metaphone.

The package implements both the transformation of phonemes into a letter-number sequence and a distance engine for comparing phonetic sequences (based on Levenshtein and Hamming distances).

Furthermore, both Russian phonetic algorithms support preprocessing for specific phoneme cases.
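To make the two distance measures named above concrete, here is a minimal pure-Python sketch. The function names `hamming` and `levenshtein` are illustrative and are not fonetika's own API; the two code strings are the sample outputs shown in the quick start below.

```python
def hamming(a: str, b: str) -> int:
    """Count the positions at which two equal-length codes differ."""
    if len(a) != len(b):
        raise ValueError("Hamming distance needs sequences of equal length")
    return sum(x != y for x, y in zip(a, b))


def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions and substitutions."""
    previous = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, x in enumerate(a, start=1):
        current = [i]
        for j, y in enumerate(b, start=1):
            current.append(min(
                previous[j] + 1,             # delete x
                current[j - 1] + 1,          # insert y
                previous[j - 1] + (x != y),  # substitute, or keep on a match
            ))
        previous = current
    return previous[-1]


print(hamming("J070530", "JA7A53A"))      # 3
print(levenshtein("J070530", "JA7A53A"))  # 3
```

Hamming distance only applies to codes of the same length, which is why it pairs naturally with fixed-length Soundex codes; Levenshtein distance also handles codes of different lengths.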

Quick start

  1. Install this package via pip
pip install fonetika
  2. Import the Soundex algorithm.

The package offers several options: the resulting sequence can be cut (as in the original Soundex), or vowels can be coded as well.

from fonetika.soundex import RussianSoundex

soundex = RussianSoundex(delete_first_letter=True)
soundex.transform('ёлочка')
...

J070530

soundex = RussianSoundex(delete_first_letter=True, code_vowels=True)
soundex.transform('ёлочка')
...

JA7A53A

The library's structure is extensible: the RussianSoundex class inherits from the base class Soundex (the original algorithm for English). To extend the algorithm, inherit your own class from Soundex and override its methods.
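The inherit-and-override pattern can be sketched with a self-contained toy encoder. `SimpleSoundex` below is a stand-in written for this example (classic American Soundex rules), not fonetika's `Soundex` class, and the names `code_for`, `length` and `VowelSoundex` are hypothetical; the methods fonetika actually exposes for overriding are not listed here.

```python
class SimpleSoundex:
    """Toy American Soundex encoder (illustration only, not fonetika's class)."""

    # Consonant groups of classic American Soundex.
    groups = {"BFPV": "1", "CGJKQSXZ": "2", "DT": "3", "L": "4", "MN": "5", "R": "6"}
    length = 4  # the original Soundex cuts or pads the code to 4 characters

    def code_for(self, letter: str) -> str:
        for letters, digit in self.groups.items():
            if letter in letters:
                return digit
        return ""  # vowels, H, W and Y carry no code

    def transform(self, word: str) -> str:
        word = word.upper()  # assumes a non-empty word
        result = word[0]
        last = self.code_for(word[0])
        for letter in word[1:]:
            digit = self.code_for(letter)
            if digit and digit != last:
                result += digit
            if letter not in "HW":  # H and W do not separate equal codes
                last = digit
        return result[:self.length].ljust(self.length, "0")


class VowelSoundex(SimpleSoundex):
    """Hypothetical subclass: vowels are coded as "A" instead of being dropped."""

    length = 7

    def code_for(self, letter: str) -> str:
        return "A" if letter in "AEIOUY" else super().code_for(letter)


print(SimpleSoundex().transform("Robert"))  # R163
print(VowelSoundex().transform("Robert"))   # RA1A630
```

Only the per-letter coding and the target length change in the subclass; the traversal logic in `transform` is reused unchanged, which is the point of the pattern.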

  3. Import the Soundex distance to compare strings
from fonetika.distance import PhoneticsInnerLanguageDistance

soundex = RussianSoundex(delete_first_letter=True)
phon_distance = PhoneticsInnerLanguageDistance(soundex)
phon_distance.distance('ёлочка', 'йолочка')
...

0
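  4. You can also calculate the distance between words of two languages. This is useful when working with languages of the same family.
from fonetika.distance import PhoneticsBetweenLanguagesDistance
# Assumption: the Metaphone classes are importable from fonetika.metaphone
from fonetika.metaphone import FinnishMetaphone, EstonianMetaphone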

m1 = FinnishMetaphone(reduce_word=False)
m2 = EstonianMetaphone(reduce_word=False)
phon_distance = PhoneticsBetweenLanguagesDistance(m1, m2)
phon_distance.distance('yö', 'öö')
...

1
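One plausible way to think about a between-languages distance is: encode each word with its own language's encoder, then compare the two codes. The sketch below illustrates that idea only; it is not fonetika's implementation, and `make_encoder`, `BetweenEncodersDistance` and both replacement tables are invented for the example and carry no real phonetic rules.

```python
from typing import Callable, Dict

Encoder = Callable[[str], str]


def make_encoder(table: Dict[str, str]) -> Encoder:
    """Build a toy encoder from a made-up character table (not real phonetics)."""
    def encode(word: str) -> str:
        return "".join(table.get(ch, ch) for ch in word.lower())
    return encode


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance between two code strings."""
    previous = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        current = [i]
        for j, y in enumerate(b, start=1):
            current.append(min(previous[j] + 1,
                               current[j - 1] + 1,
                               previous[j - 1] + (x != y)))
        previous = current
    return previous[-1]


class BetweenEncodersDistance:
    """Hypothetical helper: each word is encoded by its own language's encoder."""

    def __init__(self, first: Encoder, second: Encoder) -> None:
        self.first = first
        self.second = second

    def distance(self, word1: str, word2: str) -> int:
        return edit_distance(self.first(word1), self.second(word2))


# Invented tables, used only to show two different encoders side by side.
encode_a = make_encoder({"y": "U", "ö": "O"})
encode_b = make_encoder({"ö": "O"})

print(BetweenEncodersDistance(encode_a, encode_b).distance("yö", "öö"))  # 1
```

Keeping one encoder per language means each word is normalized by rules suited to its own spelling before the codes are compared in a shared alphabet.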


