Inverse of num2words2: convert spoken-form numbers back to numeric values across 100+ languages.

words2num2 is the inverse of num2words2: it converts spoken-form numbers (“forty-two”, “trois cent quatre”, “二十三”) back into numeric values across 100+ locales.

  • Hand-written English grammar parser: cardinals, ordinals, decimals, negatives, scale words up to centillion, year forms, “and” conjunctions, hyphenation, ASR-style outputs.

  • Generic backend for every other locale supported by num2words2, derived automatically by reverse-mapping the forward conversion.

  • Sentence-level mode that walks running text and replaces every word-number with its numeric form, preserving punctuation and surrounding text.

Installation

pip install words2num2

num2words2 is a runtime dependency for the generic multi-language backend.

Usage

>>> from words2num2 import words2num, words2num_sentence
>>> words2num("forty-two")
42
>>> words2num("one thousand two hundred thirty-four")
1234
>>> words2num("minus seven")
-7
>>> words2num("three point one four")
Decimal('3.14')
>>> words2num("nineteen ninety nine", to="year")
1999
>>> words2num("twenty-first", to="ordinal")
21
>>> words2num("quarante-deux", lang="fr")
42
>>> words2num("zweiundvierzig", lang="de")
42
>>> words2num("сорок два", lang="ru")
42

>>> words2num_sentence("I bought twenty-three apples and fourteen pears.")
'I bought 23 apples and 14 pears.'

Command-line

$ words2num2 "forty-two"
42
$ words2num2 "trois cent quatre" --lang=fr
304
$ words2num2 "twenty-third" --to=ordinal
23

Supported locales

words2num2 mirrors num2words2’s locale list — 100+ entries including:

af, am, ar, as, az, ba, be, bg, bn, bo, br, bs, ca, ce, cs, cy, da, de, el, en, en_IN, en_NG, eo, es, es_CO, es_CR, es_GT, es_NI, es_VE, et, eu, fa, fi, fo, fr, fr_BE, fr_CH, fr_DZ, gl, gu, ha, haw, he, hi, hr, ht, hu, hy, id, is, it, ja, jw, ka, kk, km, kn, ko, kz, la, lb, ln, lo, lt, lv, mg, mi, mk, ml, mn, mr, ms, mt, my, ne, nl, nn, no, oc, pa, pl, ps, pt, pt_BR, ro, ru, sa, sd, si, sk, sl, sn, so, sq, sr, su, sv, sw, ta, te, tet, tg, th, tk, tl, tr, tt, uk, ur, uz, vi, wo, yi, yo, zh, zh_CN, zh_HK, zh_TW

Aliases: jp → ja, cn → zh_CN.

API

words2num(text, lang="en", to="cardinal")

Parse text and return an int, float, or Decimal. The to argument selects the conversion mode and is one of cardinal, ordinal, ordinal_num, year, or currency.

words2num_sentence(text, lang="en", to="cardinal")

Walk a sentence, replacing every word-number with its numeric form. Returns a string. Aliases: convert_sentence, sentence_to_words.
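
As a rough illustration of the sentence-level behaviour described above, the sketch below scans text for the longest run of words found in a lookup table and swaps it for digits, leaving punctuation and other words untouched. It is not the library's implementation: replace_numbers, TOY_TABLE, and max_words are hypothetical names, and the four-entry table stands in for a real parser.

```python
import re
from typing import Dict

# Hypothetical toy table; a real backend would parse or look up far more forms.
TOY_TABLE: Dict[str, int] = {
    "three": 3,
    "fourteen": 14,
    "twenty": 20,
    "twenty three": 23,
}

# A "word" is a run of letters, optionally joined by hyphens (e.g. twenty-three).
_WORD = re.compile(r"[A-Za-z]+(?:-[A-Za-z]+)*")


def _normalize(chunk: str) -> str:
    """Lowercase, treat hyphens as spaces, and collapse whitespace."""
    return " ".join(chunk.lower().replace("-", " ").split())


def replace_numbers(text: str, table: Dict[str, int], max_words: int = 4) -> str:
    """Replace the longest table match at each position with its digits."""
    words = list(_WORD.finditer(text))
    out = []
    pos = 0  # end of the text already copied into `out`
    i = 0
    while i < len(words):
        # Try the longest candidate run first, then shrink it.
        for j in range(min(len(words), i + max_words), i, -1):
            start, end = words[i].start(), words[j - 1].end()
            key = _normalize(text[start:end])
            # Punctuation inside the run keeps the key out of the table.
            if key in table:
                out.append(text[pos:start])
                out.append(str(table[key]))
                pos = end
                i = j
                break
        else:
            i += 1  # no number starts at this word
    out.append(text[pos:])
    return "".join(out)


print(replace_numbers("I bought twenty-three apples and fourteen pears.", TOY_TABLE))
```

Matching the longest run first is what keeps "twenty three" from being rewritten as "20 3"; a comma between the two words blocks the longer match, so "twenty, three" becomes "20, 3".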

How it works

  • English (lang_EN) ships a hand-written recursive-descent parser that handles the full grammar — including ordinal/cardinal mixing, decimals, year mode, “a hundred” filler, and negative tokens.

  • Every other locale uses Words2Num_Base, which lazily builds a {normalized_words: integer} table by calling num2words2 for each integer in a configurable range (defaults to -1..10000). This guarantees correct results inside the lookup window for every locale supported upstream; values outside the window raise Words2NumError until a hand-written parser is added for that locale.
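
The reverse-mapping idea in the second bullet can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the code in Words2Num_Base: normalize, build_reverse_table, lookup, and toy_forward are hypothetical names, toy_forward stands in for the num2words2 forward conversion so the example runs without any dependency, and treating the upper bound as inclusive is a guess.

```python
from typing import Callable, Dict

_ONES = ("zero one two three four five six seven eight nine ten eleven twelve "
         "thirteen fourteen fifteen sixteen seventeen eighteen nineteen").split()
_TENS = ["", "", "twenty", "thirty", "forty", "fifty",
         "sixty", "seventy", "eighty", "ninety"]


def toy_forward(n: int) -> str:
    """Toy number-to-words converter for -99..99 (stand-in for num2words2)."""
    if n < 0:
        return "minus " + toy_forward(-n)
    if n < 20:
        return _ONES[n]
    if n < 100:
        tens, ones = divmod(n, 10)
        return _TENS[tens] + ("-" + _ONES[ones] if ones else "")
    raise ValueError("toy_forward only covers -99..99")


def normalize(words: str) -> str:
    """Lowercase, treat hyphens as spaces, and collapse whitespace."""
    return " ".join(words.lower().replace("-", " ").split())


def build_reverse_table(forward: Callable[[int], str],
                        low: int, high: int) -> Dict[str, int]:
    """Call the forward converter for low..high and invert the result."""
    table: Dict[str, int] = {}
    for n in range(low, high + 1):
        # setdefault keeps the first integer if two values spell the same.
        table.setdefault(normalize(forward(n)), n)
    return table


def lookup(table: Dict[str, int], text: str) -> int:
    """Return the integer for text, or raise KeyError outside the window."""
    key = normalize(text)
    if key not in table:
        raise KeyError(f"{text!r} is outside the lookup window")
    return table[key]


TABLE = build_reverse_table(toy_forward, -1, 99)
print(lookup(TABLE, "Forty-Two"))
```

Because the table is produced by the forward converter itself, any spelling inside the window round-trips, while "one hundred" is simply absent here and raises an error, which mirrors the out-of-range trade-off described above.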

Hand-written parsers can be added incrementally per locale by overriding to_cardinal / to_ordinal in the corresponding lang_XX.py module — same pattern as num2words2.
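
To make the override pattern concrete, here is a self-contained sketch in which a subclass replaces to_cardinal with a hand-written parser and falls back to the table-driven parent for anything it does not understand. The classes LookupBackendSketch and EnglishSketch are stand-ins invented for this example, not the package's real base class or module layout, and the parser is a simple scale-word accumulator rather than the recursive-descent parser that ships for English.

```python
from typing import Dict, Optional

_SMALL = ("zero one two three four five six seven eight nine ten eleven twelve "
          "thirteen fourteen fifteen sixteen seventeen eighteen nineteen").split()
_TENS = "twenty thirty forty fifty sixty seventy eighty ninety".split()

UNITS: Dict[str, int] = {w: i for i, w in enumerate(_SMALL)}
UNITS.update({w: 20 + 10 * i for i, w in enumerate(_TENS)})
SCALES: Dict[str, int] = {"thousand": 10**3, "million": 10**6}


def _tokens(text: str) -> list:
    """Lowercase and split on whitespace, treating hyphens as spaces."""
    return text.lower().replace("-", " ").split()


class LookupBackendSketch:
    """Hypothetical table-driven backend (stand-in for a generic base class)."""

    def __init__(self, table: Optional[Dict[str, int]] = None) -> None:
        self.table = dict(table or {})

    def to_cardinal(self, text: str) -> int:
        key = " ".join(_tokens(text))
        if key not in self.table:
            raise ValueError(f"cannot parse {text!r}")
        return self.table[key]


class EnglishSketch(LookupBackendSketch):
    """Overrides to_cardinal with a small hand-written accumulator."""

    def to_cardinal(self, text: str) -> int:
        total = current = 0
        seen_number = False
        for tok in _tokens(text):
            if tok == "and":
                continue  # "one hundred and five"
            if tok in UNITS:
                current += UNITS[tok]
            elif tok == "hundred":
                current = max(current, 1) * 100
            elif tok in SCALES:
                total += max(current, 1) * SCALES[tok]
                current = 0
            else:
                # Unknown word: defer to the table-driven parent.
                return super().to_cardinal(text)
            seen_number = True
        if not seen_number:
            return super().to_cardinal(text)
        return total + current


parser = EnglishSketch({"a dozen": 12})
print(parser.to_cardinal("one thousand two hundred thirty-four"))
```

Keeping the parent call as a fallback means a locale can gain a hand-written parser gradually: forms the new code handles are no longer limited by the lookup window, and everything else behaves as before.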

Development

make install-dev
make test
make lint
make format

License

LGPL-2.1, mirroring num2words2.
