Modern, actively maintained fork of num2words optimized for LLM/AI/speech applications.
Project description
num2words2 is a modern, actively maintained fork of the original num2words library, which converts numbers such as 42 into words such as forty-two. It supports multiple languages (see the full list below) and can also generate ordinal numbers such as forty-second. This fork was created to address the maintenance gap in the original project and to optimize for modern AI/LLM/speech applications.
The project is hosted on GitHub. Contributions are welcome.
Installation
The easiest way to install num2words2 is to use pip:
pip install num2words2
Otherwise, you can download the source package and then execute:
python setup.py install
Development Setup
The project uses pre-commit hooks to ensure code quality. To set up your development environment:
# Install pre-commit
pip install pre-commit
# Install the git hook scripts
pre-commit install
# Run hooks on all files (optional, useful for initial setup)
pre-commit run --all-files
This will automatically format and lint your code before each commit using:
autopep8 - PEP 8 formatting
autoflake - removes unused imports and variables
isort - sorts imports
flake8 - style and quality checks
trailing-whitespace removal
end-of-file fixing
Testing
The library uses pytest for testing. First, install the development dependencies:
make dev-install
Then, you can run the test suite using several methods:
Run basic tests: This runs tests with your current Python environment.
make test
Run with Tox: This runs tests against all supported Python versions, which is the standard for CI.
tox
Generating End-to-End Tests with LLMs
The repository includes a script that uses Large Language Models (LLMs) to generate realistic test cases. This helps check accuracy across multiple languages and complex scenarios.
What it does: The tests/scripts/generate_llm_tests.py script uses an LLM (like GPT-4o) to create sentences containing numbers, dates, and currencies, and then generates the expected word-for-word conversion.
Requirements:
An OpenAI API key. You must set it as an environment variable: export OPENAI_API_KEY='your-key-here'
How to Use:
To generate 10 new test sentences for French and Spanish, you can run:
python tests/scripts/generate_llm_tests.py --languages fr,es --samples 10
The new tests will be appended to tests/data/e2e_test_sentences.csv.
Key Options:
--languages: Comma-separated list of language codes (e.g., en_IN,de,it).
--samples: Number of samples to generate per language.
--mode: Use sentences for full sentences or numbers for direct number-to-word conversions.
--model: The OpenAI model to use (e.g., gpt-4o, gpt-4o-mini).
--output: Specify a different output file.
--overwrite: Overwrite the output file instead of appending.
Use this tool to expand test coverage when adding or changing language support.
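As a rough illustration of how the options above combine, the following sketch assembles the argument list for one run of the generator. The helper names build_command and api_key_present are hypothetical and not part of the repository. Treating --overwrite as a value-less flag is an assumption based on its description.

```python
import os
import sys

# Script path as given in this README; run from the repository root.
SCRIPT = "tests/scripts/generate_llm_tests.py"


def build_command(languages, samples, mode=None, model=None,
                  output=None, overwrite=False):
    """Return the argv list for one run of the generator script.

    Only flags documented in this README are used. Optional flags are
    left out when not given, so the script's own defaults apply.
    """
    cmd = [
        sys.executable, SCRIPT,
        "--languages", ",".join(languages),
        "--samples", str(samples),
    ]
    if mode is not None:
        cmd += ["--mode", mode]        # "sentences" or "numbers"
    if model is not None:
        cmd += ["--model", model]      # e.g. "gpt-4o-mini"
    if output is not None:
        cmd += ["--output", output]
    if overwrite:
        cmd.append("--overwrite")      # assumed to be a boolean flag
    return cmd


def api_key_present(env=None):
    """Check for the OPENAI_API_KEY variable the README lists as required."""
    env = os.environ if env is None else env
    return bool(env.get("OPENAI_API_KEY"))


if __name__ == "__main__":
    cmd = build_command(["fr", "es"], samples=10)
    print(" ".join(cmd))
    if not api_key_present():
        print("OPENAI_API_KEY is not set; set it before running the script.")
```

The returned list can be passed to subprocess.run(cmd, check=True) from the repository root once the API key is set.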
Usage
Command line:
$ num2words2 10001
ten thousand and one
$ num2words2 24,120.10
twenty-four thousand, one hundred and twenty point one
$ num2words2 24,120.10 -l es
veinticuatro mil ciento veinte punto uno
$ num2words2 2.14 -l es --to currency
dos euros con catorce céntimos
In code there’s only one function to use:
>>> from num2words2 import num2words
>>> num2words(42)
forty-two
>>> num2words(42, to='ordinal')
forty-second
>>> num2words(42, lang='fr')
quarante-deux
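For speech or LLM preprocessing, a typical use of this single function is to spell out the numbers embedded in a sentence. The sketch below is one minimal way to do that and is not part of the library. The helper name spell_out_integers is hypothetical, and the converter is passed in as an argument so the example runs without num2words2 installed.

```python
import re


def spell_out_integers(text, convert):
    """Replace each run of ASCII digits in `text` with convert(int(digits)).

    `convert` is any callable that maps an int to a string, for example
    num2words from num2words2. Decimals, thousands separators, dates and
    currencies are deliberately not handled in this sketch.
    """
    return re.sub(r"[0-9]+", lambda match: convert(int(match.group())), text)


if __name__ == "__main__":
    # Stand-in converter so the example is self-contained.
    demo_words = {3: "three", 42: "forty-two"}
    print(spell_out_integers("Gate 42 opens in 3 minutes", demo_words.__getitem__))
```

With the library installed, pass num2words as the converter: spell_out_integers("Gate 42", num2words).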
Besides the numerical argument, there are two main optional arguments, to: and lang:
to: The converter to use. Supported values are:
cardinal (default)
ordinal
ordinal_num
year
currency
lang: The language in which to convert the number. Supported values are:
en (English, default)
am (Amharic)
ar (Arabic)
az (Azerbaijani)
be (Belarusian)
bn (Bengali)
ca (Catalan)
ce (Chechen)
cs (Czech)
cy (Welsh)
da (Danish)
de (German)
en_GB (English - Great Britain)
en_IN (English - India)
en_NG (English - Nigeria)
es (Spanish)
es_CO (Spanish - Colombia)
es_CR (Spanish - Costa Rica)
es_GT (Spanish - Guatemala)
es_VE (Spanish - Venezuela)
eu (EURO)
fa (Farsi)
fi (Finnish)
fr (French)
fr_BE (French - Belgium)
fr_CH (French - Switzerland)
fr_DZ (French - Algeria)
he (Hebrew)
hi (Hindi)
hu (Hungarian)
hy (Armenian)
id (Indonesian)
is (Icelandic)
it (Italian)
ja (Japanese)
kn (Kannada)
ko (Korean)
kz (Kazakh)
mn (Mongolian)
lt (Lithuanian)
lv (Latvian)
nl (Dutch)
no (Norwegian)
pl (Polish)
pt (Portuguese)
pt_BR (Portuguese - Brazilian)
ro (Romanian)
ru (Russian)
sl (Slovene)
sk (Slovak)
sr (Serbian)
sv (Swedish)
te (Telugu)
tet (Tetum)
tg (Tajik)
tr (Turkish)
th (Thai)
uk (Ukrainian)
vi (Vietnamese)
zh (Chinese - Traditional)
zh_CN (Chinese - Simplified / Mainland China)
zh_TW (Chinese - Traditional / Taiwan)
zh_HK (Chinese - Traditional / Hong Kong)
You can supply values like fr_FR; if the country doesn’t exist but the language does, the code will fall back to the base language (i.e. fr). If you supply an unsupported language, NotImplementedError is raised. Therefore, if you want to call num2words with a fallback, you can do:
try:
    return num2words(42, lang=mylang)
except NotImplementedError:
    return num2words(42, lang='en')
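The fallback behavior described above can be sketched in a self-contained form. This is an illustration of the documented rules, not the library's actual implementation. The names resolve_lang and convert_with_fallback are hypothetical, and the supported set in the example is a small sample rather than the full list.

```python
def resolve_lang(lang, supported):
    """Pick the language code to use, following the documented fallback.

    An exact match wins; otherwise a code like 'fr_FR' falls back to its
    base language 'fr'; anything else raises NotImplementedError.
    """
    if lang in supported:
        return lang
    base = lang.split("_", 1)[0]
    if base in supported:
        return base
    raise NotImplementedError(f"unsupported language: {lang}")


def convert_with_fallback(number, lang, convert, default="en"):
    """Call convert(number, lang=lang), retrying with `default` on failure.

    `convert` is any callable with num2words' keyword interface.
    """
    try:
        return convert(number, lang=lang)
    except NotImplementedError:
        return convert(number, lang=default)


if __name__ == "__main__":
    sample = {"en", "fr", "fr_BE", "es"}
    print(resolve_lang("fr_BE", sample))
    print(resolve_lang("fr_FR", sample))
```

With the library installed, the wrapper is called as convert_with_fallback(42, mylang, num2words).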
Additionally, some converters and languages support other optional arguments that are needed to make the converter useful in practice.
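To see what each converter listed under to: produces for a given language, the values can be tried in one pass. The sketch below is a hypothetical helper, not library code. It assumes that a language lacking a converter signals this with NotImplementedError, as the README documents for unsupported languages. The converter is injected so the example runs without num2words2 installed.

```python
# Converter names as listed in this README.
TO_MODES = ("cardinal", "ordinal", "ordinal_num", "year", "currency")


def convert_all_modes(number, convert, lang="en"):
    """Return {mode: result} for every documented `to` value.

    A mode that raises NotImplementedError is recorded as None
    (assumption: this is how a missing converter is reported).
    """
    results = {}
    for mode in TO_MODES:
        try:
            results[mode] = convert(number, to=mode, lang=lang)
        except NotImplementedError:
            results[mode] = None
    return results


if __name__ == "__main__":
    # Stand-in converter that only echoes its arguments.
    def echo(number, to="cardinal", lang="en"):
        return f"{lang}/{to}/{number}"

    for mode, value in convert_all_modes(1999, echo).items():
        print(mode, "->", value)
```

With the library installed, call convert_all_modes(1999, num2words, lang="fr") to compare the outputs side by side.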
Wiki
For additional information on specific localizations, please check the Wiki. Feel free to propose enhancements to the wiki.
History
num2words is based on an old library, pynum2word, created by Taro Ogawa in 2003. Unfortunately, the library stopped being maintained and the author can’t be reached. Another developer, Marius Grigaitis, added Lithuanian support in 2011 but didn’t take over maintenance of the project.
Virgil Dupras from Savoir-faire Linux built on Marius Grigaitis’ improvements and republished pynum2word as num2words.
num2words2 Fork
num2words2 is a modern fork of the original num2words library, created to address the maintenance gap and optimize for modern AI/LLM/speech applications. This fork:
Provides active maintenance aligned with the rapidly evolving AI/ML ecosystem
Fixes critical bugs affecting machine learning pipelines
Adds enhanced language support for global AI applications
Maintains backward compatibility with the original library
Jean-Louis Queguiner