This package translates spoken language into its written form.

Project description

# Spoken English to Written English translator

There exists a difference between how we write and how we speak. For example, while speaking we say “I paid twenty thousand dollars to xyz organization”, but we do not write it that way; instead we write “I paid $20000 to xyz organization.” This Python module translates such spoken English into its written form.

For example, it will translate:

  1. “I watched movie named triple H .” to “I watched movie named HHH”
  2. “My weight is fifty five kilograms .” to “My weight is 55 kg”
  3. “I paid twenty thousand dollars to xyz organization .” to “I paid $20000 to xyz organization .”

<h1>Installation guide</h1>

Run this command in a terminal: `pip install spoken2written`. The dependencies spaCy and word2number are installed along with the package. It is also recommended to install spaCy's English model, en_core_web_sm.

To install en_core_web_sm, run the following command in a terminal: `python -m spacy download en_core_web_sm`

<h1>Usage</h1>

First, import the module: `import spoken2written`. If the import raises an error, the spaCy English model is most likely not installed on your machine. In that case, install en_core_web_sm using the command mentioned above.
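The check described above can be done before importing the package. The following is a minimal sketch using only the standard library; the helper name `is_installed` is hypothetical and not part of spoken2written. It assumes the spaCy model was installed as a regular Python package named `en_core_web_sm`, which is how `python -m spacy download` installs it.

```python
import importlib.util


def is_installed(module_name):
    """Return True if a top-level module with this name can be located."""
    return importlib.util.find_spec(module_name) is not None


# Assumption: the spaCy English model is importable as "en_core_web_sm".
if not is_installed("en_core_web_sm"):
    print("en_core_web_sm is missing; run: python -m spacy download en_core_web_sm")
```

If the message is printed, install the model first and then retry `import spoken2written`.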

After importing the package, use TextTranslator to translate spoken English into its written form.

Example script:

`>>> from spoken2written import TextTranslator`
`>>> test = "My life is triple B . European authorities fined Google a record sixty five thousand dollars on Wednesday for abusing its power in the mobile phone market and ordered the company to alter its practices . Furthermore , My T - Shirt size is double X in 2019 and it costs six dollars . My weight is fifty kilograms ."`
`>>> result = TextTranslator(test)`
`>>> print(result)`

Output:

`My life is BBB . European authorities fined Google a record $65000 on Wednesday for abusing its power in the mobile phone market and ordered the company to alter its practices . Furthermore , My T - Shirt size is XX in 2019 and it costs $6 . My weight is 50 kg .`

<h1>Features Used to Develop this package</h1>

  1. Named Entity Recognition (NER) is used to detect entities in the given input. NER is performed with the library spaCy. Entities such as QUANTITY (e.g. weight: fifty kilograms), MONEY (e.g. amount: thousand dollars) and proper nouns are detected with this technique.
  2. The package word2number is used to convert numbers written as ‘two thousand’ to ‘2000’. A few lines of additional logic then add a prefix or suffix such as $ or kg, depending on the type of entity.
  3. Some texts contain an entity such as “double X”. Here the word “double” acts as an adjective followed by the noun “X”. spaCy's token Matcher is used to detect such texts along with their parts of speech. After detection, a few lines of additional logic translate “double X” to “XX”.
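The second and third steps above can be sketched in plain Python. This is not the package's code: regular expressions stand in for spaCy's NER and Matcher, a small hand-written converter stands in for word2number, and every name below (`words_to_number`, `translate`, the lookup tables) is hypothetical. Only lowercase number words, dollars and kilograms are handled.

```python
import re

# Hypothetical lookup tables for this sketch only.
SMALL = {
    "zero": 0, "one": 1, "two": 2, "three": 3, "four": 4, "five": 5,
    "six": 6, "seven": 7, "eight": 8, "nine": 9, "ten": 10,
    "eleven": 11, "twelve": 12, "thirteen": 13, "fourteen": 14,
    "fifteen": 15, "sixteen": 16, "seventeen": 17, "eighteen": 18,
    "nineteen": 19,
}
TENS = {
    "twenty": 20, "thirty": 30, "forty": 40, "fifty": 50,
    "sixty": 60, "seventy": 70, "eighty": 80, "ninety": 90,
}
SCALES = {"thousand": 1000, "million": 1000000}
REPEATS = {"double": 2, "triple": 3}

# Longest words first so that "sixty" is tried before "six".
NUMBER_WORDS = sorted([*SMALL, *TENS, "hundred", *SCALES], key=len, reverse=True)

# One or more number words followed by a unit word (regex stand-in for NER).
ENTITY_RE = re.compile(
    r"\b((?:(?:" + "|".join(NUMBER_WORDS) + r")\s+)+)(dollars|kilograms)\b"
)
# "double"/"triple" followed by a single capital letter (stand-in for Matcher).
REPEAT_RE = re.compile(r"\b(double|triple)\s+([A-Z])\b")


def words_to_number(words):
    """Convert number words such as ['twenty', 'thousand'] to an int."""
    total, current = 0, 0
    for word in words:
        if word in SMALL:
            current += SMALL[word]
        elif word in TENS:
            current += TENS[word]
        elif word == "hundred":
            current = max(current, 1) * 100
        elif word in SCALES:
            total += max(current, 1) * SCALES[word]
            current = 0
        else:
            raise ValueError(f"not a number word: {word}")
    return total + current


def _format_entity(match):
    """Add the $ prefix or kg suffix depending on the unit word."""
    value = words_to_number(match.group(1).split())
    if match.group(2) == "dollars":
        return f"${value}"
    return f"{value} kg"


def translate(text):
    """Rewrite spoken-style amounts and repeated letters in written form."""
    text = ENTITY_RE.sub(_format_entity, text)
    return REPEAT_RE.sub(lambda m: m.group(2) * REPEATS[m.group(1)], text)
```

With this sketch, `translate("My weight is fifty five kilograms .")` returns `"My weight is 55 kg ."`. A regex cannot tell from context whether a phrase really is a MONEY or QUANTITY entity, which is the part NER handles in the package.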

<b>The logical code for all functions in this package can be found in spoken2written/spoken2written/ of this repository.</b>

<h1>Bugs/ Errors</h1>

Please ensure that you have installed the spaCy dependency en_core_web_sm before importing the package spoken2written. If you find any bugs or errors when using the code above, please raise an issue through <a href=””>GitHub</a>. Otherwise, send an email to <a href=””></a>.

Download files

Files for spoken2written, version 0.1.4
| Filename | Size | File type | Python version |
| --- | --- | --- | --- |
| spoken2written-0.1.4-py3-none-any.whl | 4.7 kB | Wheel | py3 |
| spoken2written-0.1.4.tar.gz | 4.4 kB | Source | None |