A Python library for normalizing Dhivehi text and converting numbers to Dhivehi text format, supporting written, spoken and year forms

dv-normalize

A Python library for normalizing Dhivehi text by converting numbers to Dhivehi and standardizing sentence endings.

Features

  • Converts numbers to Dhivehi text (both written and spoken forms)
  • Handles years (when followed by ވަނަ)
  • Handles decimal numbers
  • Normalizes formal sentence endings to colloquial form
  • Preserves proper spacing and punctuation
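The number-handling features above can be illustrated with a small, self-contained sketch. It is not the library's implementation: `classify_numbers`, `NUMBER_RE` and `YEAR_MARKER` are hypothetical names, and the 1000 to 9999 year range is borrowed from the test script in the Usage section. The sketch only labels each number as a year, a decimal or a regular integer; it does not convert anything to Dhivehi words.

```python
import re

# Hypothetical helper names; none of these are part of dv-normalize.
YEAR_MARKER = "ވަނަ"  # marker that follows a year, per the feature list
NUMBER_RE = re.compile(r"\d+(?:\.\d+)?")  # integers and simple decimals


def classify_numbers(text):
    """Return (token, kind) pairs; kind is 'year', 'decimal' or 'integer'."""
    results = []
    for match in NUMBER_RE.finditer(text):
        token = match.group()
        # Text that follows the number, ignoring the separating whitespace.
        rest = text[match.end():].lstrip()
        if "." in token:
            kind = "decimal"
        elif rest.startswith(YEAR_MARKER) and 1000 <= int(token) <= 9999:
            # Assumption: only four-digit numbers count as years.
            kind = "year"
        else:
            kind = "integer"
        results.append((token, kind))
    return results


print(classify_numbers("2024 ވަނަ އަހަރު"))  # prints [('2024', 'year')]
```

A classification step like this would let a normalizer pick the right conversion mode for each number before replacing it in the sentence.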

Installation

pip install dv-normalize

Usage

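The library exposes two main functions:

  1. int_to_dv - Converts a number to Dhivehi text. The examples below pass is_spoken=False for the written form, is_spoken=True for the spoken form, and is_year=True for the year form.
  2. spoken_dv - Converts Dhivehi text to its spoken form, covering sentence endings and the numbers embedded in the text.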

Written form

# Test case for int_to_dv

from dv_normalize.dv_num import int_to_dv

def main():
    while True:
        try:
            num = input("Enter a number (0 to trillion) or 'q' to exit: ")
            if num.lower() == 'q':
                break
                
            num = int(num)
            if num < 0:
                print("Please enter a non-negative number")
                continue
                
            print(f"{num:,} in Dhivehi:")
            written = int_to_dv(num, is_spoken=False)
            spoken = int_to_dv(num, is_spoken=True)
            # The year form is only requested for four-digit numbers (1000-9999).
            if 1000 <= num <= 9999:
                year = int_to_dv(num, is_year=True)
            else:
                year = "Not a valid year format"
            
            print(f"Written form: {written}")
            print(f"Spoken form: {spoken}")
            print(f"Year form: {year}")
            
        except ValueError:
            print("Please enter a valid number")

if __name__ == "__main__":
    main()
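For non-interactive use, the same logic can be factored into a function that is easy to test. The following is a minimal sketch, not part of the library: `describe_number` and `fake_convert` are hypothetical names, and the converter is passed in as an argument so the example runs without the package installed. In real use you would pass `int_to_dv`, assuming it accepts the `is_spoken` and `is_year` keyword arguments shown in the script above.

```python
def describe_number(num, convert):
    """Collect the written, spoken and year forms of a non-negative integer.

    `convert` is any callable accepting the keyword arguments used in the
    script above (is_spoken, is_year).
    """
    if num < 0:
        raise ValueError("num must be non-negative")
    forms = {
        "written": convert(num, is_spoken=False),
        "spoken": convert(num, is_spoken=True),
        "year": None,  # stays None outside the four-digit year range
    }
    if 1000 <= num <= 9999:
        forms["year"] = convert(num, is_year=True)
    return forms


# Stand-in converter (an assumption for illustration only); it tags the
# requested mode instead of producing Dhivehi text.
def fake_convert(num, is_spoken=False, is_year=False):
    mode = "year" if is_year else "spoken" if is_spoken else "written"
    return f"<{mode}:{num}>"


print(describe_number(2024, fake_convert))
print(describe_number(42, fake_convert))
```

Returning `None` for the year form, rather than a message string, leaves the choice of wording to the caller.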

Spoken form

from dv_normalize.dv_sentence import spoken_dv

# Test cases
test_cases = [
    "މިއަދު ވަރަށް ފިނިވެއެވެ.",  # Verb ending
    "މިއީ ރީތި ފޮތެކެވެ.",        # Noun ending
    "އޭނާ ދަނީ ސްކޫލަށެވެ.",      # Direction ending
    "1955 މީހުން ތިބެއެވެ.",      # Number with ending
    "2024 ވަނަ އަހަރު",            # Year
    "12.5 ރުފިޔާ",                # Decimal
    "1000 މީހުން",                # Regular number
    "މިއީ ރީތި ފޮތެކެވެ.",          # Sentence ending
    "އޭނާ ގެއަށެވެ.",              # Sentence ending
    "ހާއްސަ އެއްބަސްވުމުގެ ދަށުން އިންޑިއާއިން ރާއްޖެއަށް ވިއްކާ ހަކުރު އޮޅުވާލައިގެން ލަންކާއަށް!", # test sentence
    "އެ އިދާރާއިން ބަލަމުން އަންނަނީ މިދިޔަ މަހުގެ 25 ގައި އެގައުމުން ބޭރު ކުރި 64 ހާސް ޓަނުގެ ހަކުރުގެ ޝިޕްމެންޓެއްގެ މައްސަލަ އެވެ. އެ ޝިޕްމެންޓް އެގައުމުން ބޭރުކުރީ ރާއްޖެ އާއި އިންޑިއާ އާ ދެމެދު ވެފައިވާ ވިޔަފާރީގެ ހާއްސަ އެއްބަސްވުމުގެ ދަށުން ކަނޑައަޅާފައިވާ އަގުތަކުގައި ނަމަވެސް، އެއިން ބައެއް ލަންކާއަށް އެތެރެކުރިން ފަޅާއަރާފައިވާ ކަމަށް އިންޑިއާގެ ބައެއް ނޫސްތަކުގައި ރިފޯޓުކޮށްފައިވެ އެވެ." # test long sentence
]

for test in test_cases:
    print(f"Original: {test}")
    print(f"Normalized: {spoken_dv(test)}\n")
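When checking many sentences, it can help to record which inputs the normalizer actually changed. The sketch below assumes nothing about how `spoken_dv` works internally: `normalize_report` and `fake_normalize` are hypothetical names, and the stand-in normalizer simply brackets digit runs so the example is runnable on its own. Pass `spoken_dv` as the second argument to use it with the library.

```python
import re


def normalize_report(cases, normalize):
    """Run `normalize` over each string and flag the ones it changed."""
    report = []
    for text in cases:
        result = normalize(text)
        report.append(
            {"original": text, "normalized": result, "changed": result != text}
        )
    return report


# Stand-in normalizer (illustration only): wraps each digit run in brackets.
def fake_normalize(text):
    return re.sub(r"\d+", lambda m: f"[{m.group()}]", text)


cases = ["1000 މީހުން", "12.5 ރުފިޔާ", "no digits here"]
report = normalize_report(cases, fake_normalize)
print([row["changed"] for row in report])  # prints [True, True, False]
```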

License

This project is licensed under the MIT License. See the LICENSE file for details.
