Python library to decide whether two Polish words rhyme

czyrym

A Python library that detects whether two Polish words rhyme. The library has been tested on a corpus of Polish rhymes from czterycztery.pl.

Performance

The corpus contains 8100 rhymes, of which czyrym detects 7212 (about 89%). The false-positive rate is low: of the 37177 non-rhymes in the corpus, czyrym incorrectly reports only 250 (about 0.7%) as rhymes, and some of those can be argued to be genuine, if imperfect, rhymes.
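
As a quick check on these figures, the detection rate and the false-positive rate can be derived directly from the four counts. The precision value is not stated in the original text; it is computed here from the same counts and only holds for this corpus's mix of rhymes and non-rhymes.

```python
# Counts quoted in the Performance section.
rhymes_total = 8100
rhymes_detected = 7212
non_rhymes_total = 37177
false_positives = 250

# Share of true rhymes that czyrym finds.
recall = rhymes_detected / rhymes_total
# Share of non-rhymes that czyrym wrongly reports as rhymes.
false_positive_rate = false_positives / non_rhymes_total
# Derived figure (not in the original text): share of reported rhymes that are real.
precision = rhymes_detected / (rhymes_detected + false_positives)

print(f"recall:              {recall:.1%}")               # 89.0%
print(f"false-positive rate: {false_positive_rate:.2%}")  # 0.67%
print(f"precision:           {precision:.1%}")            # 96.6%
```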

When czyrym detects a pair of words as a rhyme, it also reports a distance between them: a number from 0 (a perfect rhyme) upward, where a very large value means the pair is hardly a rhyme at all. If you want czyrym to be more conservative, compare this distance against a threshold of your own.
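
A minimal sketch of such a threshold check follows. It assumes only what the usage example shows: czyrym.find_rhyme_match returns None when no rhyme is found, and otherwise an object with a total_cost attribute. The helper is_close_rhyme and the cut-off MAX_COST are hypothetical names, and the value 10 is a placeholder, since the scale of the distance is not documented here.

```python
from types import SimpleNamespace
from typing import Any, Optional

# Hypothetical cut-off: czyrym does not prescribe one, so tune it on your own data.
MAX_COST = 10


def is_close_rhyme(match: Optional[Any], max_cost: float = MAX_COST) -> bool:
    """Return True only when a match exists and its total cost is within max_cost."""
    return match is not None and match.total_cost <= max_cost


# Stand-in objects so the sketch runs without czyrym installed; in real use,
# pass the result of czyrym.find_rhyme_match(first, second) instead.
near = SimpleNamespace(total_cost=0)
far = SimpleNamespace(total_cost=250)

print(is_close_rhyme(near))  # True
print(is_close_rhyme(far))   # False
print(is_close_rhyme(None))  # False
```

Keeping the threshold outside the library call means the same match object can be tested against several cut-offs without recomputing it.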

Usage Example

import czyrym
from typing import Optional

first: str = czyrym.normalize_word('Róż, ty???')
second: str = czyrym.normalize_word('ruszty')

# if you just want to know if two words rhyme or not, do:
if czyrym.is_rhyme(first, second):
    print(f'"{first}" and "{second}" rhyme')
else:
    print(f'"{first}" and "{second}" do not rhyme')

# if you want to know if two words rhyme or not and what is the distance between them, do:
match: Optional[czyrym.RhymeMatch] = czyrym.find_rhyme_match(first, second)
if match is None:
    print(f'"{first}" and "{second}" don\'t rhyme')
else:
    print(f'A match for "{first}" and "{second}" was found - they rhyme. Total cost of the rhyme (a distance between words) is {match.total_cost}.')

# if you want to understand how it was decided that two words rhyme, do:
if match is None:
    print(f'a match for "{first}" and "{second}" was not found - they don\'t rhyme')
else:
    print(f'A match for "{first}" and "{second}" was found - they rhyme. Their common suffix is {match.common_form}.')
    for word, path in (first, match.first_path), (second, match.second_path):
        if len(path.steps) <= 1:
            print(f'For "{word}" the suffix was not mutated.')
        else:
            print(f'For "{word}" the suffix was produced with these steps:')
            for step in path.steps:
                print(f'  "{step.before}" -> "{step.after}" (mutator: {step.mutator_name}, cost: {step.cost})')
            print(f'Total cost of these steps was {path.cost}.')
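
Building on the example above, the distance can also be used to rank several candidate words against one target. This is a sketch rather than part of czyrym: rank_rhymes is a hypothetical helper, and the matcher is passed in as a parameter so the block runs without the library. In real use you would pass czyrym.find_rhyme_match and words already processed with czyrym.normalize_word.

```python
from types import SimpleNamespace
from typing import Any, Callable, Iterable, List, Optional, Tuple


def rank_rhymes(
    target: str,
    candidates: Iterable[str],
    find_match: Callable[[str, str], Optional[Any]],
) -> List[Tuple[str, Any]]:
    """Return (candidate, total_cost) pairs for rhyming candidates, best first."""
    scored = []
    for candidate in candidates:
        match = find_match(target, candidate)
        if match is not None:  # None means "no rhyme", as in the example above
            scored.append((candidate, match.total_cost))
    # Lower cost means a closer rhyme, so sort ascending by cost.
    return sorted(scored, key=lambda pair: pair[1])


# Stand-in matcher with made-up costs, used only to make the sketch runnable.
def fake_find_match(first: str, second: str) -> Optional[Any]:
    costs = {"b": 5, "c": 0}
    cost = costs.get(second)
    return None if cost is None else SimpleNamespace(total_cost=cost)


ranked = rank_rhymes("a", ["b", "c", "d"], fake_find_match)
print(ranked)  # [('c', 0), ('b', 5)]
```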

Download files

Source Distribution

czyrym-1.0.2.tar.gz (17.2 kB)

Uploaded Source

Built Distribution

czyrym-1.0.2-py3-none-any.whl (17.9 kB)

Uploaded Python 3

File details

Details for the file czyrym-1.0.2.tar.gz.

File metadata

  • Download URL: czyrym-1.0.2.tar.gz
  • Size: 17.2 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.0.1 CPython/3.8.10

File hashes

Hashes for czyrym-1.0.2.tar.gz

Algorithm    Hash digest
SHA256       73c8a6364c8d3bac29f14ffd28e305a3b79a4e4c8b46f1a86478a7f40c4a9a9a
MD5          ddc52af0209079e2120bf037b44246ed
BLAKE2b-256  0499412180cde0d31c2a408eb523df941da417eb3ae9b229c0740de1d9cb7684

File details

Details for the file czyrym-1.0.2-py3-none-any.whl.

File metadata

  • Download URL: czyrym-1.0.2-py3-none-any.whl
  • Size: 17.9 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.0.1 CPython/3.8.10

File hashes

Hashes for czyrym-1.0.2-py3-none-any.whl

Algorithm    Hash digest
SHA256       c310fa3c710dd9932b5137d4b07aed159358c6960535148e7ca4f0904996c479
MD5          83b3e4c689d57732832d0401b016c2ba
BLAKE2b-256  9730a80b1f0eef5ac0d300255542cd12d3af60f070df565576bbbb73e9acb383
