
Malayalam phonetic analyser

This is the Python interface for mlphon, the Malayalam phonetic analyser.

Installation

Installing inside a virtual environment (https://docs.python.org/3/library/venv.html) is recommended.

$ pip install mlphon

Syllablize a Malayalam Word

The following Python snippet splits a word in Malayalam script into syllables.

from mlphon import PhoneticAnalyser
mlphon = PhoneticAnalyser()
mlphon.split_to_syllables('കേരളം')

It will give the result

['കേ', 'ര', 'ളം']
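Since the result is a plain list of strings, it can be post-processed directly. The following is a minimal sketch, not part of mlphon: the helper names `hyphenate`, `syllable_counts` and `demo` are hypothetical, and the only mlphon call assumed is `split_to_syllables` as shown above.

```python
def hyphenate(word, analyser, sep="-"):
    """Join the syllables of a word with a visible separator."""
    return sep.join(analyser.split_to_syllables(word))


def syllable_counts(words, analyser):
    """Map each word to the number of syllables the analyser returns."""
    return {word: len(analyser.split_to_syllables(word)) for word in words}


def demo():
    # Imported here so the helpers above stay usable without mlphon installed.
    from mlphon import PhoneticAnalyser

    analyser = PhoneticAnalyser()
    print(hyphenate("കേരളം", analyser))
    print(syllable_counts(["കേരളം"], analyser))
```

The helpers accept any object that provides `split_to_syllables`, so one `PhoneticAnalyser` instance can be created once and reused for a whole word list.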

Phonetically analyse a Malayalam Word

from mlphon import PhoneticAnalyser
mlphon = PhoneticAnalyser()
mlphon.analyse('കേരളം')

It gives the result as a sequence of syllables, each listing its IPA symbols and their associated phonetic tags.

[{'phonemes': [{'ipa': 'k', 'tags': ['plosive', 'voiceless', 'unaspirated', 'velar']}, {'ipa': 'eː', 'tags': ['v_sign']}]}, {'phonemes': [{'ipa': 'ɾ', 'tags': ['flapped', 'alveolar']}, {'ipa': 'a', 'tags': ['inherentvowel']}]}, {'phonemes': [{'ipa': 'ɭ', 'tags': ['lateral', 'retroflex']}, {'ipa': 'a', 'tags': ['inherentvowel']}, {'ipa': 'm', 'tags': ['anuswara']}]}]
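The nested result can be walked with ordinary list and dictionary operations. The sketch below assumes the structure is exactly as printed above (a list of syllables, each with a 'phonemes' list of dictionaries holding 'ipa' and 'tags'); the helper names `ipa_string` and `phonemes_with_tag` are hypothetical and not part of mlphon.

```python
def ipa_string(analysis, sep=" "):
    """Join the IPA symbols of every phoneme in the analysis, in order."""
    return sep.join(
        phoneme["ipa"]
        for syllable in analysis
        for phoneme in syllable["phonemes"]
    )


def phonemes_with_tag(analysis, tag):
    """Return the IPA symbols of all phonemes that carry the given tag."""
    return [
        phoneme["ipa"]
        for syllable in analysis
        for phoneme in syllable["phonemes"]
        if tag in phoneme["tags"]
    ]


# Sample data copied from the result shown above for 'കേരളം'.
sample = [
    {"phonemes": [
        {"ipa": "k", "tags": ["plosive", "voiceless", "unaspirated", "velar"]},
        {"ipa": "eː", "tags": ["v_sign"]},
    ]},
    {"phonemes": [
        {"ipa": "ɾ", "tags": ["flapped", "alveolar"]},
        {"ipa": "a", "tags": ["inherentvowel"]},
    ]},
    {"phonemes": [
        {"ipa": "ɭ", "tags": ["lateral", "retroflex"]},
        {"ipa": "a", "tags": ["inherentvowel"]},
        {"ipa": "m", "tags": ["anuswara"]},
    ]},
]

print(ipa_string(sample))                          # k eː ɾ a ɭ a m
print(phonemes_with_tag(sample, "inherentvowel"))  # ['a', 'a']
```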

Malayalam g2p : Grapheme to Phoneme conversion

from mlphon import PhoneticAnalyser
mlphon = PhoneticAnalyser()
mlphon.grapheme_to_phoneme('കാറ്റ്')

It gives the IPA sequence as output.

['kaːṯṯə']

Malayalam p2g : Phoneme to Grapheme conversion

from mlphon import PhoneticAnalyser
mlphon = PhoneticAnalyser()
mlphon.phoneme_to_grapheme('paːlə')

It gives the corresponding grapheme sequences as output. Note that it gives two possible sequences, one of which is obsolete.

[പാല്’]

Command Line Interface for the above operations: mlphon

usage:
mlphon [-h] [-s] [-a] [-p] [-g] [-i INFILE] [-o OUTFILE] [-v]

optional arguments:
-h, --help                      show this help message and exit
-s, --syllablize                Syllablize the input Malayalam string
-a, --analyse                   Phonetically analyse the input Malayalam string
-p, --tophoneme                 Transcribe the input Malayalam grapheme to phoneme
-g, --tographeme                Transcribe the input phoneme to Malayalam grapheme
-i INFILE, --input INFILE       source of analysis data
-o OUTFILE, --output OUTFILE    target of generated strings
-v, --verbose                   print verbosely while processing

For example, to perform the g2p operation on a set of words stored in inputfile.txt, with one Malayalam word per line:

$ mlphon -p -i path/to/inputfile.txt -o path/to/outputfile.txt

Input file contents:

$ cat path/to/inputfile.txt
അകത്തുള്ളത്
അകപ്പെട്ടത്
അകലെ

Output file contents:

അകത്തുള്ളത്    akat̪t̪uɭɭat̪ə
അകപ്പെട്ടത്    akappeʈʈat̪ə
അകലെ    akale
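The same batch conversion can be approximated from Python with the `grapheme_to_phoneme` method shown earlier. This is a rough sketch rather than the CLI's actual implementation: the helper names `transcribe_words`, `transcribe_file` and `demo` are hypothetical, and the tab-separated output format is an assumption modelled on the output file above.

```python
def transcribe_words(words, g2p):
    """Yield one 'word<TAB>ipa' line for each transcription g2p returns."""
    for word in words:
        word = word.strip()
        if not word:
            continue  # skip blank lines in the input
        for ipa in g2p(word):
            yield f"{word}\t{ipa}"


def transcribe_file(infile, outfile, g2p):
    """Read one word per line from infile and write transcriptions to outfile."""
    with open(infile, encoding="utf-8") as src, \
            open(outfile, "w", encoding="utf-8") as dst:
        for line in transcribe_words(src, g2p):
            dst.write(line + "\n")


def demo(infile, outfile):
    # Imported here so the helpers above stay usable without mlphon installed.
    from mlphon import PhoneticAnalyser

    analyser = PhoneticAnalyser()
    transcribe_file(infile, outfile, analyser.grapheme_to_phoneme)
```

Passing the conversion function as an argument keeps the file handling independent of mlphon, so the same helpers would also work with `phoneme_to_grapheme`.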

Application: Using mlphon to create a phonetic lexicon

A typical use case of phonetic analysis is to create a phonetic lexicon to be used in Automatic Speech Recognition or Text to Speech Synthesis. The phonetic representation with each phoneme separated by a space can be obtained as below:

from mlphon import PhoneticAnalyser, split_as_phonemes
mlphon = PhoneticAnalyser()
split_as_phonemes(mlphon.analyse('ഇന്ത്യയുടെ'))

It results in the output:

'i n̪ t̪ j a j u ʈ e'

The phonetic representation with each syllable separated by a space can be obtained as below:

from mlphon import PhoneticAnalyser, split_as_syllables
mlphon = PhoneticAnalyser()
split_as_syllables(mlphon.analyse('ഇന്ത്യയുടെ'))

It results in the output:

'i n̪t̪ja ju ʈe'
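Building on the calls above, a lexicon for a word list could be assembled as follows. This is a minimal sketch under stated assumptions: the helper names `build_lexicon` and `demo` are hypothetical, and the line format (word, a space, then space-separated phonemes) follows a common pronunciation-lexicon convention rather than anything mlphon prescribes.

```python
def build_lexicon(words, to_phonemes):
    """Return sorted, de-duplicated 'word phonemes' lexicon lines."""
    entries = {(word, to_phonemes(word)) for word in words}
    return [f"{word} {phonemes}" for word, phonemes in sorted(entries)]


def demo(words):
    # Imported here so build_lexicon stays usable without mlphon installed.
    from mlphon import PhoneticAnalyser, split_as_phonemes

    analyser = PhoneticAnalyser()
    return build_lexicon(
        words,
        lambda word: split_as_phonemes(analyser.analyse(word)),
    )
```

Swapping `split_as_phonemes` for `split_as_syllables` in `demo` would give a syllable-level lexicon instead.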
