Converting number formats

Project description

numericnormalizer

numericnormalizer is a basic NLP library that converts numbers between numeric form (e.g. 5) and written word or character form (e.g. 'five' or '五').

Installation

pip install numericnormalizer

Usage

Importing the module

from numericnormalizer import normalizer

Convert a number to a word (e.g. 5 -> 'five')

normalizer.number_to_word(5, lang='en')
>> "five"


normalizer.number_to_word(5, lang='zh')
>> "五"

Convert a word to a number (e.g. 'five' -> 5)

normalizer.word_to_number('five', lang='en')
>> 5


normalizer.word_to_number('五', lang='zh')
>> 5

Format numbers in a sentence

Example 1: default formatting

normalizer.format_sentence(
    sentence='What are the 6 principles of intercultural adaption?',
    lang='en'
)
>> "What are the six (6) principles of intercultural adaption?"

Example 2: custom formatting

normalizer.format_sentence(
    sentence='I have 4 apples and five oranges.',
    lang='en',
    formatting='{number} [{word}]',  # custom formatting
)
>> "I have 4 [four] apples and 5 [five] oranges."

Example 3: restricting the maximum number

normalizer.format_sentence(
    sentence='I have 4 apples and five oranges.',
    lang='en',
    max_number=4  # numbers above max_number are left unchanged
)
>> "I have four (4) apples and five oranges."
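
The three examples above can be reproduced with a small standalone sketch. This is not the library's implementation: the name format_sentence_sketch, the English-only lookup table, and the default template '{word} ({number})' are assumptions inferred from the outputs shown above.

```python
import re

# Hypothetical English lookup table; the library's own data is not shown here.
WORDS = ["zero", "one", "two", "three", "four", "five",
         "six", "seven", "eight", "nine", "ten"]
NUMBERS = {word: n for n, word in enumerate(WORDS)}

# Whole-token match for the digits 0-10 or their English words.
TOKEN = re.compile(r"\b(?:10|[0-9]|" + "|".join(WORDS) + r")\b", re.IGNORECASE)


def format_sentence_sketch(sentence, formatting="{word} ({number})", max_number=10):
    """Rewrite digits and number words up to max_number using the template."""
    def replace(match):
        token = match.group(0)
        number = int(token) if token.isdigit() else NUMBERS[token.lower()]
        if number > max_number:
            return token  # leave values above max_number untouched
        return formatting.format(number=number, word=WORDS[number])

    return TOKEN.sub(replace, sentence)


print(format_sentence_sketch("I have 4 apples and five oranges.", max_number=4))
```

Because the pattern is anchored with word boundaries, multi-digit values such as 12 are not matched and pass through unchanged in this sketch.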

Language Support

The supported languages are from the Azure Language Detect List:

  • Afrikaans (af)
  • Albanian (sq)
  • Amharic (am)
  • Arabic (ar)
  • Armenian (hy)
  • Assamese (as)
  • Azerbaijani (az)
  • Bashkir (ba)
  • Basque (eu)
  • Belarusian (be)
  • Bengali (bn)
  • Bosnian (bs)
  • Bulgarian (bg)
  • Burmese (my)
  • Catalan (ca)
  • Central Khmer (km)
  • Chinese (zh)
  • Chinese Simplified (zh_chs)
  • Chinese Traditional (zh_cht)
  • Chuvash (cv)
  • Corsican (co)
  • Croatian (hr)
  • Czech (cs)
  • Danish (da)
  • Dari (prs)
  • Divehi (dv)
  • Dutch (nl)
  • English (en)
  • Esperanto (eo)
  • Estonian (et)
  • Faroese (fo)
  • Fijian (fj)
  • Finnish (fi)
  • French (fr)
  • Galician (gl)
  • Georgian (ka)
  • German (de)
  • Greek (el)
  • Gujarati (gu)
  • Haitian (ht)
  • Hausa (ha)
  • Hebrew (he)
  • Hindi (hi)
  • Hmong Daw (mww)
  • Hungarian (hu)
  • Icelandic (is)
  • Igbo (ig)
  • Indonesian (id)
  • Inuktitut (iu)
  • Irish (ga)
  • Italian (it)
  • Japanese (ja)
  • Javanese (jv)
  • Kannada (kn)
  • Kazakh (kk)
  • Kinyarwanda (rw)
  • Kirghiz (ky)
  • Korean (ko)
  • Kurdish (ku)
  • Lao (lo)
  • Latin (la)
  • Latvian (lv)
  • Lithuanian (lt)
  • Luxembourgish (lb)
  • Macedonian (mk)
  • Malagasy (mg)
  • Malay (ms)
  • Malayalam (ml)
  • Maltese (mt)
  • Maori (mi)
  • Marathi (mr)
  • Mongolian (mn)
  • Nepali (ne)
  • Norwegian (no)
  • Norwegian Nynorsk (nn)
  • Odia (or)
  • Pashto (ps)
  • Persian (fa)
  • Polish (pl)
  • Portuguese (pt)
  • Punjabi (pa)
  • Queretaro Otomi (otq)
  • Romanian (ro)
  • Russian (ru)
  • Samoan (sm)
  • Serbian (sr)
  • Shona (sn)
  • Sindhi (sd)
  • Sinhala (si)
  • Slovak (sk)
  • Slovenian (sl)
  • Somali (so)
  • Spanish (es)
  • Sundanese (su)
  • Swahili (sw)
  • Swedish (sv)
  • Tagalog (tl)
  • Tahitian (ty)
  • Tajik (tg)
  • Tamil (ta)
  • Tatar (tt)
  • Telugu (te)
  • Thai (th)
  • Tibetan (bo)
  • Tigrinya (ti)
  • Tongan (to)
  • Turkish (tr)
  • Turkmen (tk)
  • Upper Sorbian (hsb)
  • Uyghur (ug)
  • Ukrainian (uk)
  • Urdu (ur)
  • Uzbek (uz)
  • Vietnamese (vi)
  • Welsh (cy)
  • Xhosa (xh)
  • Yiddish (yi)
  • Yoruba (yo)
  • Yucatec Maya (yua)
  • Zulu (zu)

However, because this is an early release, the format_sentence feature has not been tested thoroughly in every language. It is currently designed only for languages that separate words with spaces, because it relies on regex word-match notation.
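
To illustrate why space-delimited text matters, here is a minimal sketch using Python's re module. It assumes the matching relies on the \b word-boundary anchor, which the description above suggests but does not confirm. The helper name has_standalone_token is hypothetical.

```python
import re


def has_standalone_token(token, sentence):
    """Return True if token occurs in sentence delimited by word boundaries."""
    pattern = r"\b" + re.escape(token) + r"\b"
    return re.search(pattern, sentence) is not None


# Spaces around "5" create word boundaries, so the token is found.
print(has_standalone_token("5", "I have 5 apples"))  # True

# Chinese is written without spaces and every character counts as a word
# character, so there is no boundary on either side of "五".
print(has_standalone_token("五", "我有五个苹果"))  # False
```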

Number support

Currently, only the numbers 0 to 10 are supported. Negative numbers are not supported.
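
Since the behavior for out-of-range input is not documented here, callers may want to validate values before passing them to the library. A minimal sketch follows. The helper check_supported is hypothetical and not part of numericnormalizer, and it assumes whole numbers only.

```python
def check_supported(number):
    """Return number if it is an integer from 0 to 10, otherwise raise."""
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(number, bool) or not isinstance(number, int):
        raise TypeError("expected an int, got " + type(number).__name__)
    if not 0 <= number <= 10:
        raise ValueError("only the numbers 0 to 10 are supported")
    return number


print(check_supported(5))  # 5
```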

