Converting number formats

numericnormalizer

This is a basic NLP library that converts numbers between numeric form (e.g. 5) and written form (e.g. 'five').

Installation

pip install numericnormalizer

Usage

Import the module

from numericnormalizer import normalizer

Convert a number to a word (e.g. 5 -> 'five')

normalizer.number_to_word(5, lang='en')
>> "five"


normalizer.number_to_word(5, lang='zh')
>> "五"

Convert a word to a number (e.g. 'five' -> 5)

normalizer.word_to_number('five', lang='en')
>> 5


normalizer.word_to_number('五', lang='zh')
>> 5

Format numbers in a sentence

Example 1: Default formatting

normalizer.format_sentence(
    sentence='What are the 6 principles of intercultural adaption?',
    lang='en'
)
>> "What are the six (6) principles of intercultural adaption?"

Example 2: Custom formatting

normalizer.format_sentence(
    sentence='I have 4 apples and five oranges.',
    lang='en',
    formatting='{number} [{word}]',  # custom template built from {number} and {word}
)
>> "I have 4 [four] apples and 5 [five] oranges."

Example 3: Restricting the maximum number

normalizer.format_sentence(
    sentence='I have 4 apples and five oranges.',
    lang='en',
    max_number=4  # numbers above max_number are left unchanged
)
>> "I have four (4) apples and five oranges."
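
The three examples above can be pictured as a single regex substitution. The following is a minimal sketch, not the library's actual implementation: the helper name format_sentence_sketch, the English-only lookup table, and the default template '{word} ({number})' (inferred from the output of Example 1) are all assumptions made for illustration.

```python
import re

# Hypothetical English-only lookup table covering 0-10.
WORDS = ["zero", "one", "two", "three", "four", "five",
         "six", "seven", "eight", "nine", "ten"]
WORD_TO_NUMBER = {word: n for n, word in enumerate(WORDS)}

# Match a standalone run of digits or one of the known number words.
PATTERN = re.compile(r"\b(\d+|" + "|".join(WORDS) + r")\b", re.IGNORECASE)


def format_sentence_sketch(sentence, formatting="{word} ({number})", max_number=10):
    """Rewrite each supported number in `sentence` using `formatting`."""

    def replace(match):
        token = match.group(1)
        number = int(token) if token.isdigit() else WORD_TO_NUMBER[token.lower()]
        if number > max_number or number >= len(WORDS):
            return token  # out of range: leave the original text untouched
        return formatting.format(number=number, word=WORDS[number])

    return PATTERN.sub(replace, sentence)


print(format_sentence_sketch("I have 4 apples and five oranges.", max_number=4))
```

Under these assumptions the sketch reproduces the outputs shown in Examples 1 to 3. The real function may differ in how it handles casing, punctuation, and other languages.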

Language Support

The supported languages are taken from the Azure Language Detect list:

  • Afrikaans (af)
  • Albanian (sq)
  • Amharic (am)
  • Arabic (ar)
  • Armenian (hy)
  • Assamese (as)
  • Azerbaijani (az)
  • Bashkir (ba)
  • Basque (eu)
  • Belarusian (be)
  • Bengali (bn)
  • Bosnian (bs)
  • Bulgarian (bg)
  • Burmese (my)
  • Catalan (ca)
  • Central Khmer (km)
  • Chinese (zh)
  • Chinese Simplified (zh_chs)
  • Chinese Traditional (zh_cht)
  • Chuvash (cv)
  • Corsican (co)
  • Croatian (hr)
  • Czech (cs)
  • Danish (da)
  • Dari (prs)
  • Divehi (dv)
  • Dutch (nl)
  • English (en)
  • Esperanto (eo)
  • Estonian (et)
  • Faroese (fo)
  • Fijian (fj)
  • Finnish (fi)
  • French (fr)
  • Galician (gl)
  • Georgian (ka)
  • German (de)
  • Greek (el)
  • Gujarati (gu)
  • Haitian (ht)
  • Hausa (ha)
  • Hebrew (he)
  • Hindi (hi)
  • Hmong Daw (mww)
  • Hungarian (hu)
  • Icelandic (is)
  • Igbo (ig)
  • Indonesian (id)
  • Inuktitut (iu)
  • Irish (ga)
  • Italian (it)
  • Japanese (ja)
  • Javanese (jv)
  • Kannada (kn)
  • Kazakh (kk)
  • Kinyarwanda (rw)
  • Kirghiz (ky)
  • Korean (ko)
  • Kurdish (ku)
  • Lao (lo)
  • Latin (la)
  • Latvian (lv)
  • Lithuanian (lt)
  • Luxembourgish (lb)
  • Macedonian (mk)
  • Malagasy (mg)
  • Malay (ms)
  • Malayalam (ml)
  • Maltese (mt)
  • Maori (mi)
  • Marathi (mr)
  • Mongolian (mn)
  • Nepali (ne)
  • Norwegian (no)
  • Norwegian Nynorsk (nn)
  • Odia (or)
  • Pashto (ps)
  • Persian (fa)
  • Polish (pl)
  • Portuguese (pt)
  • Punjabi (pa)
  • Queretaro Otomi (otq)
  • Romanian (ro)
  • Russian (ru)
  • Samoan (sm)
  • Serbian (sr)
  • Shona (sn)
  • Sindhi (sd)
  • Sinhala (si)
  • Slovak (sk)
  • Slovenian (sl)
  • Somali (so)
  • Spanish (es)
  • Sundanese (su)
  • Swahili (sw)
  • Swedish (sv)
  • Tagalog (tl)
  • Tahitian (ty)
  • Tajik (tg)
  • Tamil (ta)
  • Tatar (tt)
  • Telugu (te)
  • Thai (th)
  • Tibetan (bo)
  • Tigrinya (ti)
  • Tongan (to)
  • Turkish (tr)
  • Turkmen (tk)
  • Upper Sorbian (hsb)
  • Uyghur (ug)
  • Ukrainian (uk)
  • Urdu (ur)
  • Uzbek (uz)
  • Vietnamese (vi)
  • Welsh (cy)
  • Xhosa (xh)
  • Yiddish (yi)
  • Yoruba (yo)
  • Yucatec Maya (yua)
  • Zulu (zu)

However, because this is an early release, the format_sentence feature has not been tested thoroughly in every language. It is currently designed only for languages that separate words with spaces, because it relies on regex word-boundary matching.
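
The space-delimited restriction can be illustrated with Python's standard re module. This is a small sketch of the general regex behaviour, not code taken from the library, and the exact pattern the library uses is an assumption here.

```python
import re

# \b matches only where a word character meets a non-word character.
english = "I have five oranges."
chinese = "我有五个苹果。"  # "I have five apples."

english_match = re.search(r"\bfive\b", english)
chinese_match = re.search(r"\b五\b", chinese)

print(english_match is not None)  # True: spaces surround "five"
print(chinese_match is not None)  # False: the neighbouring characters are word characters
```

In Chinese text, 五 sits directly between other word characters, so no word boundary exists on either side and the pattern finds nothing.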

Number support

Currently, only the whole numbers 0 to 10 are supported. Negative numbers are not supported.
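
How the library reacts to an out-of-range value is not documented here, so callers may want to check the input first. The helper below is a hypothetical sketch (is_supported_number is not part of numericnormalizer) that encodes the documented range.

```python
def is_supported_number(value):
    """Return True if `value` is an integer in the documented 0-10 range."""
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(value, bool) or not isinstance(value, int):
        return False
    return 0 <= value <= 10


for candidate in (0, 10, 11, -1, 2.5):
    print(candidate, is_supported_number(candidate))
```

A caller could run this check before calling normalizer.number_to_word and fall back to the original digits when it returns False.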
