
Inverse of num2words2: convert spoken-form numbers back to numeric values across 100+ languages.


The inverse of num2words2.

Convert spoken-form numbers (“forty-two”, “trois cent quatre”, “二十三”) back into numeric values across 100+ locales.

  • Hand-written English grammar parser: cardinals, ordinals, decimals, negatives, scale words up to centillion, year forms, “and” conjunctions, hyphenation, ASR-style outputs.

  • Generic backend for every other locale supported by num2words2, derived automatically by reverse-mapping the forward conversion.

  • Sentence-level mode that walks running text and replaces every word-number with its numeric form, preserving punctuation and surrounding text.

Installation

pip install words2num2

num2words2 is a runtime dependency for the generic multi-language backend.

Usage

>>> from words2num2 import words2num, words2num_sentence
>>> words2num("forty-two")
42
>>> words2num("one thousand two hundred thirty-four")
1234
>>> words2num("minus seven")
-7
>>> words2num("three point one four")
Decimal('3.14')
>>> words2num("nineteen ninety nine", to="year")
1999
>>> words2num("twenty-first", to="ordinal")
21
>>> words2num("quarante-deux", lang="fr")
42
>>> words2num("zweiundvierzig", lang="de")
42
>>> words2num("сорок два", lang="ru")
42

>>> words2num_sentence("I bought twenty-three apples and fourteen pears.")
'I bought 23 apples and 14 pears.'

Auto-parse: numbers + currencies + units + locales

auto_parse extracts a numeric value and its unit from a free-text expression. auto_parse_sentence walks running text and replaces every quantity in place. Both support configurable thousands/decimal separators per locale, currency symbols ($€£¥₹₽₩₺) and ISO codes (USD, EUR, GBP, …), scale shortcuts ($5m, $1.5b), SI/imperial units (length, mass, temperature, time, volume), percentages, and disambiguation hints for ambiguous unit symbols.

>>> from words2num2 import auto_parse, auto_parse_sentence

>>> auto_parse("$12,345.00")
Quantity(value=12345.0, unit='USD', kind='currency', confidence=1.0)
>>> auto_parse("$5m").value
5000000
>>> auto_parse("5cm")
Quantity(value=5, unit='cm', kind='length', confidence=1.0)
>>> auto_parse("20°C").kind
'temperature'
>>> auto_parse("forty-two kg").value
42

# Configurable separators
>>> auto_parse("1.234,56", lang="de").value
1234.56
>>> auto_parse("1 234,56", lang="fr").value
1234.56

# Disambiguation
>>> auto_parse("5m", prefer={"m": "mile"}).unit_long
'mile'

# Sentence mode
>>> auto_parse_sentence("Pay $12.50 for 5kg of apples at -5°C.")
'Pay 12.5 USD for 5 kg of apples at -5 °C.'
>>> auto_parse_sentence("Pay $12.50 for 5kg.", expand=True)
'Pay 12.5 dollar for 5 kilogram.'
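
The separator handling shown above can be illustrated with a minimal, standalone sketch. It does not use words2num2 itself: the SEPARATORS table and the parse_localized helper are hypothetical names invented for this example, and the library's own implementation may differ.

```python
from decimal import Decimal

# Hypothetical (thousands, decimal) separators per locale; not the library's data.
SEPARATORS = {
    "en": (",", "."),
    "de": (".", ","),
    "fr": (" ", ","),
}


def parse_localized(text: str, lang: str = "en") -> Decimal:
    """Drop the locale's thousands separator, then normalise the decimal mark."""
    thousands, decimal_mark = SEPARATORS[lang]
    # Treat non-breaking and narrow non-breaking spaces like plain spaces.
    cleaned = text.strip().replace("\u00a0", " ").replace("\u202f", " ")
    # Order matters: remove grouping first, so a German "." is never read as a decimal point.
    cleaned = cleaned.replace(thousands, "")
    if decimal_mark != ".":
        cleaned = cleaned.replace(decimal_mark, ".")
    return Decimal(cleaned)


print(parse_localized("1.234,56", lang="de"))
print(parse_localized("1 234,56", lang="fr"))
```

Returning a Decimal rather than a float keeps the digits exactly as written, which is usually what you want for currency amounts.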

Command-line

$ words2num2 "forty-two"
42
$ words2num2 "trois cent quatre" --lang=fr
304
$ words2num2 "twenty-third" --to=ordinal
23

Supported locales

words2num2 mirrors num2words2’s locale list — 100+ entries including:

af, am, ar, as, az, ba, be, bg, bn, bo, br, bs, ca, ce, cs, cy, da, de, el, en, en_IN, en_NG, eo, es, es_CO, es_CR, es_GT, es_NI, es_VE, et, eu, fa, fi, fo, fr, fr_BE, fr_CH, fr_DZ, gl, gu, ha, haw, he, hi, hr, ht, hu, hy, id, is, it, ja, jw, ka, kk, km, kn, ko, kz, la, lb, ln, lo, lt, lv, mg, mi, mk, ml, mn, mr, ms, mt, my, ne, nl, nn, no, oc, pa, pl, ps, pt, pt_BR, ro, ru, sa, sd, si, sk, sl, sn, so, sq, sr, su, sv, sw, ta, te, tet, tg, th, tk, tl, tr, tt, uk, ur, uz, vi, wo, yi, yo, zh, zh_CN, zh_HK, zh_TW

Aliases: jp → ja, cn → zh_CN.

API

words2num(text, lang="en", to="cardinal")

Parse text and return an int, float, or Decimal. The to argument is one of cardinal, ordinal, ordinal_num, year, currency.

words2num_sentence(text, lang="en", to="cardinal")

Walk a sentence, replacing every word-number with its numeric form. Returns a string. Aliases: convert_sentence, sentence_to_words.
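
To make the idea of sentence-level replacement concrete, here is a small standalone sketch. It is not the library's implementation: toy_cardinal and replace_numbers are hypothetical helpers, and the vocabulary only covers English cardinals up to the thousands. It finds runs of number words joined by spaces or hyphens and substitutes each run, leaving all other text and punctuation untouched.

```python
import re

# Toy English vocabulary (an assumption for this sketch, not the library's tables).
UNITS = {w: i for i, w in enumerate(
    "zero one two three four five six seven eight nine ten eleven twelve "
    "thirteen fourteen fifteen sixteen seventeen eighteen nineteen".split())}
TENS = {w: 10 * i for i, w in enumerate(
    "twenty thirty forty fifty sixty seventy eighty ninety".split(), start=2)}

# Longest words first so the alternation never settles for a shorter prefix.
VOCAB = sorted([*UNITS, *TENS, "hundred", "thousand"], key=len, reverse=True)
WORD = "(?:" + "|".join(VOCAB) + ")"
# A run is one number word followed by more number words joined by " " or "-".
NUMBER_RUN = re.compile(rf"\b{WORD}(?:[ -]{WORD})*\b", re.IGNORECASE)


def toy_cardinal(phrase: str) -> int:
    """Accumulate a cardinal from a run of number words."""
    total = current = 0
    for word in re.split(r"[ -]", phrase.lower()):
        if word in UNITS:
            current += UNITS[word]
        elif word in TENS:
            current += TENS[word]
        elif word == "hundred":
            current = max(current, 1) * 100
        elif word == "thousand":
            total += max(current, 1) * 1000
            current = 0
    return total + current


def replace_numbers(text: str) -> str:
    """Replace every run of number words with its digits."""
    return NUMBER_RUN.sub(lambda m: str(toy_cardinal(m.group(0))), text)


print(replace_numbers("I bought twenty-three apples and fourteen pears."))
```

The word boundaries in the pattern keep words such as "someone" or "tennis" from being mistaken for numbers. A real parser also has to decide where one number ends and the next begins (for example "one two"), which this sketch does not attempt.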

How it works

  • English (lang_EN) ships a hand-written recursive-descent parser that handles the full grammar — including ordinal/cardinal mixing, decimals, year mode, “a hundred” filler, and negative tokens.

  • Every other locale uses Words2Num_Base, which lazily builds a {normalized_words: integer} table by calling num2words2 for each integer in a configurable range (default: -1 to 10000). Every value inside that window round-trips correctly for every locale supported upstream; values outside it raise Words2NumError until a hand-written parser is added for that locale.
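
The reverse-mapping idea can be sketched without any dependencies. Everything named here is an assumption made for illustration: ReverseTable, normalize, OutOfRangeError (standing in for Words2NumError) and toy_forward (standing in for the num2words2 forward conversion) are not the library's actual classes or functions.

```python
from functools import cached_property
from typing import Callable

ONES = ("zero one two three four five six seven eight nine ten eleven twelve "
        "thirteen fourteen fifteen sixteen seventeen eighteen nineteen").split()
TENS = "_ _ twenty thirty forty fifty sixty seventy eighty ninety".split()


def toy_forward(n: int) -> str:
    """Stand-in forward converter (number -> words), valid for -99..99 only."""
    if n < 0:
        return "minus " + toy_forward(-n)
    if n < 20:
        return ONES[n]
    tens, ones = divmod(n, 10)
    return TENS[tens] + ("-" + ONES[ones] if ones else "")


def normalize(text: str) -> str:
    """Lower-case, treat hyphens as spaces, collapse whitespace."""
    return " ".join(text.lower().replace("-", " ").split())


class OutOfRangeError(ValueError):
    """Hypothetical stand-in for the library's Words2NumError."""


class ReverseTable:
    def __init__(self, forward: Callable[[int], str], low: int = -1, high: int = 10000):
        self.forward, self.low, self.high = forward, low, high

    @cached_property
    def table(self) -> dict[str, int]:
        # Built once, on first access: call the forward converter for every
        # integer in the window and invert the result.
        return {normalize(self.forward(n)): n for n in range(self.low, self.high + 1)}

    def parse(self, text: str) -> int:
        try:
            return self.table[normalize(text)]
        except KeyError:
            raise OutOfRangeError(f"not in lookup window: {text!r}") from None


lookup = ReverseTable(toy_forward, low=-1, high=99)
print(lookup.parse("Forty-Two"))
```

The trade-off is visible in the sketch: the table costs one forward call per integer in the window, and anything outside the window (here, "one hundred") cannot be parsed at all.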

Hand-written parsers can be added incrementally per locale by overriding to_cardinal / to_ordinal in the corresponding lang_XX.py module — same pattern as num2words2.
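
As a rough illustration of that override pattern, the sketch below layers one hand-written rule on top of a table lookup. BaseSketch and FrenchSketch are hypothetical stand-ins, not the library's Words2Num_Base or its lang_FR module, and the tiny French table is an assumption made for the example. Only the method name to_cardinal comes from the description above.

```python
class BaseSketch:
    """Stand-in for a generic backend: exact-match table lookup only."""

    TABLE = {"un": 1, "deux": 2, "trois": 3, "quatre": 4, "cinq": 5,
             "six": 6, "sept": 7, "huit": 8, "neuf": 9}

    def to_cardinal(self, text: str) -> int:
        try:
            return self.TABLE[text.strip().lower()]
        except KeyError:
            raise ValueError(f"out of lookup range: {text!r}") from None


class FrenchSketch(BaseSketch):
    """Adds one hand-written rule, '<n> mille <rest>', and defers otherwise."""

    def to_cardinal(self, text: str) -> int:
        head, sep, tail = text.strip().lower().partition("mille")
        if not sep:
            # No scale word: fall back to the generic lookup.
            return super().to_cardinal(text)
        thousands = super().to_cardinal(head) if head.strip() else 1
        rest = super().to_cardinal(tail) if tail.strip() else 0
        return thousands * 1000 + rest


print(FrenchSketch().to_cardinal("deux mille trois"))
```

Because the subclass falls back to the inherited lookup, a locale can gain coverage one rule at a time without losing what the generic backend already handles.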

Development

make install-dev
make test
make lint
make format

License

LGPL-2.1, mirroring num2words2.
