
Inverse of num2words2: convert spoken-form numbers back to numeric values across 100+ languages.


The inverse of num2words2.

Convert spoken-form numbers (“forty-two”, “trois cent quatre”, “二十三”) back into numeric values across 100+ locales.

  • Hand-written English grammar parser: cardinals, ordinals, decimals, negatives, scale words up to centillion, year forms, “and” conjunctions, hyphenation, ASR-style outputs.

  • Generic backend for every other locale supported by num2words2, derived automatically by reverse-mapping the forward conversion.

  • Sentence-level mode that walks running text and replaces every word-number with its numeric form, preserving punctuation and surrounding text.

Installation

pip install words2num2

num2words2 is a runtime dependency for the generic multi-language backend.

Usage

>>> from words2num2 import words2num, words2num_sentence
>>> words2num("forty-two")
42
>>> words2num("one thousand two hundred thirty-four")
1234
>>> words2num("minus seven")
-7
>>> words2num("three point one four")
Decimal('3.14')
>>> words2num("nineteen ninety nine", to="year")
1999
>>> words2num("twenty-first", to="ordinal")
21
>>> words2num("quarante-deux", lang="fr")
42
>>> words2num("zweiundvierzig", lang="de")
42
>>> words2num("сорок два", lang="ru")
42

>>> words2num_sentence("I bought twenty-three apples and fourteen pears.")
'I bought 23 apples and 14 pears.'
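To make the cardinal examples above concrete, here is a minimal sketch of one common way to turn English number words into an integer: a running accumulator with scale words. It is not the library's parser (which is described as recursive-descent), and `parse_cardinal` and the word tables are hypothetical names used only for illustration.

```python
UNITS = {
    "zero": 0, "one": 1, "two": 2, "three": 3, "four": 4, "five": 5,
    "six": 6, "seven": 7, "eight": 8, "nine": 9, "ten": 10,
    "eleven": 11, "twelve": 12, "thirteen": 13, "fourteen": 14,
    "fifteen": 15, "sixteen": 16, "seventeen": 17, "eighteen": 18,
    "nineteen": 19,
}
TENS = {
    "twenty": 20, "thirty": 30, "forty": 40, "fifty": 50,
    "sixty": 60, "seventy": 70, "eighty": 80, "ninety": 90,
}
SCALES = {"thousand": 1_000, "million": 1_000_000, "billion": 1_000_000_000}


def parse_cardinal(text: str) -> int:
    """Toy English cardinal parser (hypothetical, not the library's code).

    Raises ValueError on unknown words. It does not validate grammar,
    so odd word orders are accepted without complaint.
    """
    tokens = text.lower().replace("-", " ").split()
    sign = 1
    if tokens and tokens[0] in ("minus", "negative"):
        sign = -1
        tokens = tokens[1:]
    if not tokens:
        raise ValueError("no number words found")

    total = 0    # sum of completed scale groups (thousands, millions, ...)
    current = 0  # group currently being built, below the next scale word
    for tok in tokens:
        if tok == "and":
            continue  # "one hundred and five" reads the same as without "and"
        if tok in UNITS:
            current += UNITS[tok]
        elif tok in TENS:
            current += TENS[tok]
        elif tok == "hundred":
            current = max(current, 1) * 100  # bare "hundred" counts as 100
        elif tok in SCALES:
            total += max(current, 1) * SCALES[tok]
            current = 0
        else:
            raise ValueError(f"unknown number word: {tok!r}")
    return sign * (total + current)
```

The accumulator keeps the sketch short, but it cannot reject malformed input such as "twenty hundred forty thousand" the way a grammar-driven parser can.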

Auto-parse: numbers + currencies + units + locales

auto_parse extracts a numeric value plus its unit from any free-text expression. auto_parse_sentence walks running text and replaces every quantity in place. Supports configurable thousands/decimal separators per locale, currency symbols ($€£¥₹₽₩₺) and ISO codes (USD, EUR, GBP, …), scale shortcuts ($5m, $1.5b), SI/imperial units (length/mass/temperature/time/volume), percent, and disambiguation hints.

>>> from words2num2 import auto_parse, auto_parse_sentence

>>> auto_parse("$12,345.00")
Quantity(value=12345.0, unit='USD', kind='currency', confidence=1.0)
>>> auto_parse("$5m").value
5000000
>>> auto_parse("5cm")
Quantity(value=5, unit='cm', kind='length', confidence=1.0)
>>> auto_parse("20°C").kind
'temperature'
>>> auto_parse("forty-two kg").value
42

# Configurable separators
>>> auto_parse("1.234,56", lang="de").value
1234.56
>>> auto_parse("1 234,56", lang="fr").value
1234.56

# Disambiguation
>>> auto_parse("5m", prefer={"m": "mile"}).unit_long
'mile'

# Sentence mode
>>> auto_parse_sentence("Pay $12.50 for 5kg of apples at -5°C.")
'Pay 12.5 USD for 5 kg of apples at -5 °C.'
>>> auto_parse_sentence("Pay $12.50 for 5kg.", expand=True)
'Pay 12.5 dollar for 5 kilogram.'
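The separator and scale-shortcut handling shown above can be sketched in a few lines. This is an illustration under stated assumptions, not the library's implementation: `SEPARATORS`, `SCALE_SUFFIXES`, `parse_localized` and `parse_scaled` are hypothetical names, and the sketch returns `Decimal` for exactness whereas the examples above show plain numbers.

```python
from decimal import Decimal

# Hypothetical per-locale table: (thousands separator, decimal separator).
SEPARATORS = {
    "en": (",", "."),
    "de": (".", ","),
    "fr": (" ", ","),
}

# Only the two suffixes that appear in the examples above.
SCALE_SUFFIXES = {"m": 1_000_000, "b": 1_000_000_000}


def parse_localized(text: str, lang: str = "en") -> Decimal:
    """Parse a digit string using the locale's separators."""
    thousands, decimal = SEPARATORS[lang]
    cleaned = text.strip().replace(thousands, "")
    # No-break spaces are also common group separators in typeset text.
    cleaned = cleaned.replace("\u00a0", "").replace("\u202f", "")
    if decimal != ".":
        cleaned = cleaned.replace(decimal, ".")
    return Decimal(cleaned)


def parse_scaled(text: str, lang: str = "en") -> Decimal:
    """Parse a number with an optional trailing scale suffix ("5m")."""
    body = text.strip().lower()
    factor = 1
    if body and body[-1] in SCALE_SUFFIXES:
        factor = SCALE_SUFFIXES[body[-1]]
        body = body[:-1]
    return parse_localized(body, lang) * factor
```

Note that this sketch always reads a trailing "m" as "million". Telling it apart from metres or miles is exactly the ambiguity that the prefer hint addresses.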

Command-line

$ words2num2 "forty-two"
42
$ words2num2 "trois cent quatre" --lang=fr
304
$ words2num2 "twenty-third" --to=ordinal
23

Supported locales

words2num2 mirrors num2words2’s locale list — 100+ entries including:

af, am, ar, as, az, ba, be, bg, bn, bo, br, bs, ca, ce, cs, cy, da, de, el, en, en_IN, en_NG, eo, es, es_CO, es_CR, es_GT, es_NI, es_VE, et, eu, fa, fi, fo, fr, fr_BE, fr_CH, fr_DZ, gl, gu, ha, haw, he, hi, hr, ht, hu, hy, id, is, it, ja, jw, ka, kk, km, kn, ko, kz, la, lb, ln, lo, lt, lv, mg, mi, mk, ml, mn, mr, ms, mt, my, ne, nl, nn, no, oc, pa, pl, ps, pt, pt_BR, ro, ru, sa, sd, si, sk, sl, sn, so, sq, sr, su, sv, sw, ta, te, tet, tg, th, tk, tl, tr, tt, uk, ur, uz, vi, wo, yi, yo, zh, zh_CN, zh_HK, zh_TW

Aliases: jp → ja, cn → zh_CN.
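Alias handling of this kind usually amounts to a dictionary lookup before the locale is validated. A minimal sketch, assuming a plain mapping; `ALIASES`, `SUPPORTED` and `resolve_locale` are hypothetical names, and the supported set is truncated for illustration.

```python
ALIASES = {"jp": "ja", "cn": "zh_CN"}
SUPPORTED = {"en", "fr", "de", "ja", "zh_CN"}  # truncated for illustration


def resolve_locale(code: str) -> str:
    """Map an alias to its canonical locale code, then validate it."""
    canonical = ALIASES.get(code, code)
    if canonical not in SUPPORTED:
        raise ValueError(f"unsupported locale: {code!r}")
    return canonical
```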

API

words2num(text, lang="en", to="cardinal")

Parse text and return an int, float, or Decimal. to selects the parse mode and is one of cardinal, ordinal, ordinal_num, year, currency.

words2num_sentence(text, lang="en", to="cardinal")

Walk a sentence, replacing every word-number with its numeric form. Returns a string. Aliases: convert_sentence, sentence_to_words.
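Sentence mode can be pictured as a regular-expression pass that finds runs of number words and substitutes each run with its value, leaving everything else untouched. The sketch below is a simplified illustration, not the library's code: `NUMBER_RUN`, `_value` and `replace_numbers` are hypothetical names, the vocabulary stops at "thousand", and "and" is never treated as part of a number.

```python
import re

SMALL = {w: i for i, w in enumerate(
    "zero one two three four five six seven eight nine ten eleven twelve "
    "thirteen fourteen fifteen sixteen seventeen eighteen nineteen".split())}
TENS = {w: 10 * i for i, w in enumerate(
    "twenty thirty forty fifty sixty seventy eighty ninety".split(), start=2)}
VALUES = {**SMALL, **TENS}
SCALE_WORDS = ("hundred", "thousand")

# Longest words first so "fourteen" is tried before "four".
_word = "|".join(sorted([*VALUES, *SCALE_WORDS], key=len, reverse=True))
# A run of number words joined by single spaces or hyphens.
NUMBER_RUN = re.compile(rf"\b(?:{_word})(?:[ -](?:{_word}))*\b", re.IGNORECASE)


def _value(run: str) -> int:
    """Evaluate one matched run of number words."""
    total = current = 0
    for tok in re.split(r"[ -]", run.lower()):
        if tok in VALUES:
            current += VALUES[tok]
        elif tok == "hundred":
            current = max(current, 1) * 100
        else:  # "thousand"
            total += max(current, 1) * 1000
            current = 0
    return total + current


def replace_numbers(text: str) -> str:
    """Replace every run of number words with digits; keep all other text."""
    return NUMBER_RUN.sub(lambda m: str(_value(m.group(0))), text)
```

A pass like this also rewrites "one" when it is used as a pronoun, which is one reason a real implementation needs more context than a bare pattern match.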

How it works

  • English (lang_EN) ships a hand-written recursive-descent parser that handles the full grammar — including ordinal/cardinal mixing, decimals, year mode, “a hundred” filler, and negative tokens.

  • Every other locale uses Words2Num_Base, which lazily builds a {normalized_words: integer} table by calling num2words2 for each integer in a configurable range (defaults to -1..10000). Results are correct by construction inside that lookup window for every locale supported upstream; values outside it raise Words2NumError until a hand-written parser is added.
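The reverse-mapping idea can be sketched independently of any particular forward converter. In the example below `toy_forward` stands in for num2words2 and only covers -99..99, `ReverseLookup` is a hypothetical class name, the range is assumed to be inclusive at both ends, and `ValueError` stands in for Words2NumError.

```python
from functools import cached_property
from typing import Callable

ONES = ("zero one two three four five six seven eight nine ten eleven twelve "
        "thirteen fourteen fifteen sixteen seventeen eighteen nineteen").split()
TENS = "twenty thirty forty fifty sixty seventy eighty ninety".split()


def toy_forward(n: int) -> str:
    """Stand-in for a forward number-to-words function; handles -99..99."""
    if n < 0:
        return "minus " + toy_forward(-n)
    if n < 20:
        return ONES[n]
    tens, ones = divmod(n, 10)
    return TENS[tens - 2] + ("-" + ONES[ones] if ones else "")


class ReverseLookup:
    """Invert a forward converter by enumerating a window of integers."""

    def __init__(self, forward: Callable[[int], str],
                 low: int = -1, high: int = 10_000):
        self._forward = forward
        self._range = range(low, high + 1)  # assumed inclusive window

    @staticmethod
    def normalize(words: str) -> str:
        # Lowercase, treat hyphens as spaces, collapse repeated whitespace.
        return " ".join(words.lower().replace("-", " ").split())

    @cached_property
    def table(self) -> dict:
        # Built on first access only, then reused for every later parse.
        return {self.normalize(self._forward(n)): n for n in self._range}

    def parse(self, words: str) -> int:
        try:
            return self.table[self.normalize(words)]
        except KeyError:
            raise ValueError(f"not in lookup window: {words!r}") from None


lookup = ReverseLookup(toy_forward, low=-1, high=99)
```

With this setup `lookup.parse("forty-two")` returns 42, while `lookup.parse("one hundred")` raises because 100 lies outside the enumerated window. The trade-off is the same one described above: no grammar has to be written per locale, but coverage ends where the table ends.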

Hand-written parsers can be added incrementally per locale by overriding to_cardinal / to_ordinal in the corresponding lang_XX.py module — same pattern as num2words2.
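The override pattern might look roughly like the following. The real signatures of Words2Num_Base and to_cardinal are not shown here, so `BaseConverter`, `HandWrittenConverter` and the tiny Spanish vocabulary are hypothetical stand-ins chosen only to show a subclass adding one grammar rule and falling back to the table otherwise.

```python
class BaseConverter:
    """Hypothetical stand-in for a table-backed base class."""

    TABLE = {"uno": 1, "dos": 2, "tres": 3}

    def to_cardinal(self, text: str) -> int:
        try:
            return self.TABLE[text.strip().lower()]
        except KeyError:
            raise ValueError(f"not in lookup table: {text!r}") from None


class HandWrittenConverter(BaseConverter):
    """Adds one grammar rule and defers to the table for everything else."""

    def to_cardinal(self, text: str) -> int:
        words = text.strip().lower().split()
        if words == ["mil"]:
            return 1000
        # Toy rule: "<unit> mil" means unit * 1000, e.g. "dos mil" is 2000.
        if len(words) == 2 and words[1] == "mil":
            return super().to_cardinal(words[0]) * 1000
        return super().to_cardinal(text)
```

Because the subclass only intercepts the inputs it understands, a locale can gain coverage rule by rule without losing the table-backed behaviour.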

Development

make install-dev
make test
make lint
make format

License

LGPL-2.1, mirroring num2words2.
