Library For Spoken to written
spokenTowritten
A module for converting spoken English to written English. The layout/class diagram of the package is as follows.
Features Implemented:
- Conversion of number words to digits.
- double five == 55
- triple nine eight four == 99984
- one nought one == 101
- double 'B' == BB
- one million == 1M
- Decontraction of words.
- Currency representation.
- Unit abbreviations.
- kilometer == km
- meter == m
Examples:
Let's look at some examples to gain a better understanding. Below is the piece of spoken text used for illustration.
My mobile number is double nine nine five one six seven triple one.
The cost of mobile is 48 thousand rupees.
It is not easy to crack UPSC examination, people do give double attempts to clear it.
Double standards jokes aren't tolerated here.
My weight is 54 kilogram.I live 16 kilometers away from my office.
Decontraction:
Code
spoken_text = Decontraction.decontraction(spoken_text)
print("Text after decontraction:")
print(spoken_text)
Output
My mobile number is double nine nine five one six seven triple one.
The cost of mobile is 48 thousand rupees.
It is not easy to crack UPSC examination, people do give double attempts to clear it.
Double standards jokes are not tolerated here.
My weight is 54 kilogram.I live 16 kilometers away from my office.
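The decontraction step above can be approximated with a lookup table and a single regular expression. The following is only a minimal sketch, not the package's implementation: the `CONTRACTIONS` table and the `expand_contractions` function are hypothetical names introduced for illustration, and the real mapping inside `Decontraction` may differ.

```python
import re

# Hypothetical contraction table (assumption); the package's own mapping may differ.
CONTRACTIONS = {
    "aren't": "are not",
    "isn't": "is not",
    "don't": "do not",
    "can't": "cannot",
}

# One alternation over all known contractions, matched on word boundaries.
_PATTERN = re.compile(
    r"\b(" + "|".join(re.escape(c) for c in CONTRACTIONS) + r")\b",
    flags=re.IGNORECASE,
)


def expand_contractions(text: str) -> str:
    """Replace each known contraction with its expanded form."""

    def _expand(match: "re.Match[str]") -> str:
        word = match.group(0)
        expanded = CONTRACTIONS[word.lower()]
        # Preserve a leading capital, e.g. "Don't" -> "Do not".
        if word[0].isupper():
            expanded = expanded[0].upper() + expanded[1:]
        return expanded

    return _PATTERN.sub(_expand, text)


print(expand_contractions("Double standards jokes aren't tolerated here."))
# Double standards jokes are not tolerated here.
```

Contractions that are missing from the table are left untouched, so the table decides how much of the text gets expanded.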
Numbers to digits conversion
Code
spoken_text = Text2Digits().convert(spoken_text)
print(spoken_text)
Output
My mobile number is 9995167111.
The cost of mobile is 48000 rupees.
It is not easy to crack UPSC examination, people do give double attempts to clear it.
Double standards jokes are not tolerated here.
My weight is 54 kilogram. I live 16 kilometers away from my office.
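Note that "double" and "triple" are expanded only when a digit word follows, which is why "double attempts" and "Double standards" stay unchanged. The sketch below shows one way to get that behaviour with a regular expression. It is an illustration under assumptions, not the code behind `Text2Digits`: the `DIGITS` and `REPEATS` tables and the `words_to_digits` function are hypothetical, and scale words such as "thousand" or "million" are deliberately not handled.

```python
import re

# Hypothetical lookup tables (assumptions, not taken from the package).
DIGITS = {
    "zero": "0", "nought": "0", "one": "1", "two": "2", "three": "3",
    "four": "4", "five": "5", "six": "6", "seven": "7", "eight": "8",
    "nine": "9",
}
REPEATS = {"double": 2, "triple": 3}

_DIGIT_ALT = "|".join(DIGITS)
# A unit is a digit word, optionally preceded by "double" or "triple".
_UNIT = rf"(?:(?:double|triple)\s+)?(?:{_DIGIT_ALT})"
# A run is one or more units separated by whitespace.
_RUN = re.compile(rf"\b{_UNIT}(?:\s+{_UNIT})*\b", flags=re.IGNORECASE)


def words_to_digits(text: str) -> str:
    """Collapse runs of spoken digit words into a digit string."""

    def _collapse(match: "re.Match[str]") -> str:
        digits = []
        repeat = 1
        for word in match.group(0).lower().split():
            if word in REPEATS:
                repeat = REPEATS[word]  # applies to the next digit word
            else:
                digits.append(DIGITS[word] * repeat)
                repeat = 1
        return "".join(digits)

    return _RUN.sub(_collapse, text)


print(words_to_digits(
    "My mobile number is double nine nine five one six seven triple one."
))
# My mobile number is 9995167111.
print(words_to_digits("People do give double attempts to clear it."))
# People do give double attempts to clear it.
```

One limitation of this sketch is that a lone digit word in ordinary prose (for example "one of them") is also converted, so a real implementation would need extra context rules.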
Currency Symbol
Code
spoken_text = Currency().currency(spoken_text)
print(spoken_text)
Output
My mobile number is 9995167111.
The cost of mobile is ₹48000.
It is not easy to crack UPSC examination, people do give double attempts to clear it.
Double standards jokes are not tolerated here.
My weight is 54 kilogram. I live 16 kilometers away from my office.
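The currency step moves the symbol in front of the amount and drops the currency word. A minimal sketch of that rewrite follows, assuming a small hypothetical `CURRENCY_SYMBOLS` table and an `apply_currency_symbols` function that are not part of the package's documented API.

```python
import re

# Hypothetical currency table (assumption); the package may support other names.
CURRENCY_SYMBOLS = {
    "rupee": "₹", "rupees": "₹",
    "dollar": "$", "dollars": "$",
    "euro": "€", "euros": "€",
}

# Longest names first so that "rupees" is tried before "rupee".
_NAMES = sorted(CURRENCY_SYMBOLS, key=len, reverse=True)
_AMOUNT = re.compile(
    r"\b(\d+(?:\.\d+)?)\s+(" + "|".join(_NAMES) + r")\b",
    flags=re.IGNORECASE,
)


def apply_currency_symbols(text: str) -> str:
    """Rewrite '<amount> <currency word>' as '<symbol><amount>'."""
    return _AMOUNT.sub(
        lambda m: CURRENCY_SYMBOLS[m.group(2).lower()] + m.group(1),
        text,
    )


print(apply_currency_symbols("The cost of mobile is 48000 rupees."))
# The cost of mobile is ₹48000.
```

The sketch expects the amount to already be in digits, which matches the order of the steps in this walkthrough: number conversion runs before the currency step.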
Units abbreviation
Code
# The dictionary of units must be passed explicitly: {unit: abbreviation}
spoken_text = Units.unit(spoken_text, WEIGHTS)
print(spoken_text)
Output
My mobile number is 9995167111.
The cost of mobile is ₹48000.
It is not easy to crack UPSC examination, people do give double attempts to clear it.
Double standards jokes are not tolerated here.
My weight is 54kg. I live 16 kilometers away from my office.
Observe that kilometers did not get abbreviated to km, as only the weights dictionary was passed.
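The effect of passing only one unit dictionary can be reproduced with a short sketch. Everything here is an assumption made for illustration: the contents of `WEIGHTS` and `DISTANCES` and the `abbreviate_units` function are hypothetical and do not describe how `Units.unit` is actually written.

```python
import re

# Hypothetical unit dictionaries in the {unit: abbreviation} shape used above.
WEIGHTS = {"kilogram": "kg", "gram": "g"}
DISTANCES = {"kilometer": "km", "meter": "m"}


def abbreviate_units(text: str, units: dict) -> str:
    """Abbreviate only the units present in the given dictionary."""
    if not units:
        return text
    # Longest names first so that "kilogram" is tried before "gram".
    names = sorted(units, key=len, reverse=True)
    pattern = re.compile(
        r"(\d+(?:\.\d+)?)\s*(" + "|".join(map(re.escape, names)) + r")s?\b",
        flags=re.IGNORECASE,
    )
    # Join the amount and the abbreviation without a space, e.g. "54kg".
    return pattern.sub(lambda m: m.group(1) + units[m.group(2).lower()], text)


text = "My weight is 54 kilogram. I live 16 kilometers away from my office."
print(abbreviate_units(text, WEIGHTS))
# My weight is 54kg. I live 16 kilometers away from my office.
print(abbreviate_units(text, {**WEIGHTS, **DISTANCES}))
# My weight is 54kg. I live 16km away from my office.
```

Keeping the dictionaries separate lets the caller decide which families of units are abbreviated in a given pass.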
Future Expansion:
The following features can be added by adding the respective classes without prior testing.
- Text abbreviations
- Representing text using a short form, e.g. Thank you so much == tysm
- Central Bureau of Investigation == CBI
- et cetera == etc.
- Mathematical symbol representation.
- plus, minus, division, modulo, integration, differentiation, etc.
- Emoji Implementation.
- Add features for short notes preparation.
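As an illustration of how one of these features might be added as its own class, here is a rough sketch of a text abbreviation step. The `TextAbbreviation` class, its `abbreviate` method, and the `ABBREVIATIONS` table are hypothetical and do not exist in the package; the entries are taken from the examples in the list above.

```python
import re

# Hypothetical phrase table built from the examples listed above.
ABBREVIATIONS = {
    "thank you so much": "tysm",
    "Central Bureau of Investigation": "CBI",
    "et cetera": "etc.",
}


class TextAbbreviation:
    """Hypothetical class that replaces known phrases with short forms."""

    def __init__(self, mapping: dict):
        if not mapping:
            raise ValueError("mapping must contain at least one phrase")
        # Lower-case the keys so lookups are case-insensitive.
        self._mapping = {phrase.lower(): abbr for phrase, abbr in mapping.items()}
        # Longest phrases first so longer matches win over shorter ones.
        phrases = sorted(self._mapping, key=len, reverse=True)
        self._pattern = re.compile(
            r"\b(" + "|".join(re.escape(p) for p in phrases) + r")\b",
            flags=re.IGNORECASE,
        )

    def abbreviate(self, text: str) -> str:
        return self._pattern.sub(
            lambda m: self._mapping[m.group(1).lower()], text
        )


abbreviator = TextAbbreviation(ABBREVIATIONS)
print(abbreviator.abbreviate("The Central Bureau of Investigation called."))
# The CBI called.
```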