
Python library to decide if two Polish words rhyme

Project description

czyrym

A Python library to detect if two Polish words rhyme. The library has been tested on a corpus of Polish rhymes from czterycztery.pl.

Performance

The corpus contains 8100 rhymes, of which czyrym detects 7387 (about 91%). The false positive rate is low: of the 37177 non-rhymes in the corpus, czyrym incorrectly reports only 239 (about 0.6%) as rhymes, and some of those are arguably real, if imperfect, rhymes.
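
The counts above can be turned into the usual evaluation metrics. The short sketch below only redoes the arithmetic on the four reported numbers; the variable names are chosen here for illustration and are not part of czyrym.

```python
# Counts reported above for the czterycztery.pl corpus.
rhymes_total = 8100         # pairs labelled as rhymes
rhymes_detected = 7387      # of those, pairs czyrym accepts
non_rhymes_total = 37177    # pairs labelled as non-rhymes
non_rhymes_flagged = 239    # of those, pairs czyrym wrongly accepts

# Share of real rhymes that are found.
recall = rhymes_detected / rhymes_total
# Share of non-rhymes that are wrongly accepted.
false_positive_rate = non_rhymes_flagged / non_rhymes_total
# Share of accepted pairs that are real rhymes.
precision = rhymes_detected / (rhymes_detected + non_rhymes_flagged)

print(f"recall: {recall:.1%}")                            # recall: 91.2%
print(f"false positive rate: {false_positive_rate:.2%}")  # false positive rate: 0.64%
print(f"precision: {precision:.1%}")                      # precision: 96.9%
```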

When czyrym detects a pair of words as a rhyme, it also reports the distance between the two words: a number from 0 (a perfect rhyme) upward, where a very large value means the pair is hardly a rhyme at all. If you want czyrym to be more conservative, compare this distance against a threshold of your own.
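
A minimal sketch of such a threshold check follows. To stay runnable without the library installed it uses a stand-in class, StubMatch, in place of czyrym.RhymeMatch; the helper is_strict_rhyme and the value of MAX_COST are assumptions made for illustration, not part of czyrym.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class StubMatch:
    """Stand-in for czyrym.RhymeMatch; only total_cost is modelled."""
    total_cost: float


def is_strict_rhyme(match: Optional[StubMatch], max_cost: float) -> bool:
    """Accept a pair only if a match was found and its cost is within max_cost."""
    return match is not None and match.total_cost <= max_cost


# With the real library, the match would come from
# czyrym.find_rhyme_match(first, second) instead of StubMatch.
MAX_COST = 10.0  # hypothetical threshold; tune it on your own data

print(is_strict_rhyme(StubMatch(total_cost=0), MAX_COST))   # True
print(is_strict_rhyme(StubMatch(total_cost=25), MAX_COST))  # False
print(is_strict_rhyme(None, MAX_COST))                      # False
```

A suitable threshold depends on the scale of the costs czyrym actually produces, so it is worth inspecting total_cost for a few known good and bad pairs before fixing a value.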

Usage Example

import czyrym
from typing import Optional

first: str = czyrym.normalize_word('Róż, ty???')
second: str = czyrym.normalize_word('ruszty')

# if you just want to know if two words rhyme or not, do:
if czyrym.is_rhyme(first, second):
    print(f'"{first}" and "{second}" rhyme')
else:
    print(f'"{first}" and "{second}" do not rhyme')

# if you want to know if two words rhyme or not and what is the distance between them, do:
match: Optional[czyrym.RhymeMatch] = czyrym.find_rhyme_match(first, second)
if match is None:
    print(f'"{first}" and "{second}" don\'t rhyme')
else:
    print(f'A match for "{first}" and "{second}" was found - they rhyme. Total cost of the rhyme (a distance between words) is {match.total_cost}.')

# if you want to understand how it was decided that two words rhyme, do:
if match is None:
    print(f'a match for "{first}" and "{second}" was not found - they don\'t rhyme')
else:
    print(f'A match for "{first}" and "{second}" was found - they rhyme. Their common suffix is {match.common_form}.')
    for word, path in (first, match.first_path), (second, match.second_path):
        if len(path.steps) <= 1:
            print(f'For "{word}" the suffix was not mutated.')
        else:
            print(f'For "{word}" the suffix was produced with these steps:')
            for step in path.steps:
                print(f'  "{step.before}" -> "{step.after}" (mutator: {step.mutator_name}, cost: {step.cost})')
            print(f'Total cost of these steps was {path.cost}.')
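
Because every match carries a distance, the same calls can be used to rank several candidate words against one target. The sketch below is one possible way to do that, not something czyrym provides: rank_rhymes is a hypothetical helper, and the matcher is passed in as a parameter so the example runs with a stub. In real use you would pass czyrym.find_rhyme_match and normalize the words with czyrym.normalize_word first. The words and costs in the stub are invented.

```python
from types import SimpleNamespace
from typing import Callable, Iterable, List, Optional, Tuple


def rank_rhymes(
    word: str,
    candidates: Iterable[str],
    find_match: Callable[[str, str], Optional[object]],
) -> List[Tuple[str, float]]:
    """Return (candidate, cost) pairs for candidates that rhyme, best first."""
    ranked = []
    for candidate in candidates:
        match = find_match(word, candidate)
        if match is not None:  # None means "no rhyme", as in the example above
            ranked.append((candidate, match.total_cost))
    ranked.sort(key=lambda pair: pair[1])  # lowest cost = closest rhyme
    return ranked


# Stub matcher with invented costs, standing in for czyrym.find_rhyme_match.
FAKE_COSTS = {"candidate_a": 12.0, "candidate_b": 0.0, "candidate_c": None}


def stub_find_match(first: str, second: str) -> Optional[SimpleNamespace]:
    cost = FAKE_COSTS.get(second)
    return None if cost is None else SimpleNamespace(total_cost=cost)


print(rank_rhymes("target", ["candidate_a", "candidate_b", "candidate_c"], stub_find_match))
# [('candidate_b', 0.0), ('candidate_a', 12.0)]
```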

