parse numbers written in natural language

Project description

number-parser is a simple library that converts numbers written in natural language to their equivalent numeric forms. It currently supports cardinal numbers in English, Hindi, Spanish, and Russian, and ordinal numbers in English.

Installation

pip install number-parser

number-parser requires Python 3.6+.

Usage

The library provides three major APIs, which correspond to the following common use cases.

Interface #1: Multiple numbers

Identifies the numbers in a text string and converts them to their numeric values, leaving non-numeric words unchanged. Ordinal numbers are also converted (English only).

>>> from number_parser import parse
>>> parse("I have two hats and thirty seven coats")
'I have 2 hats and 37 coats'
>>> parse("One, Two, Three go")
'1, 2, 3 go'
>>> parse("First day of year two thousand")
'1 day of year 2000'
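
Because parse returns a string, a common follow-up step is to pull the converted values back out as integers. The sketch below is not part of number-parser: extract_integers is a hypothetical helper built on the standard re module, and it is applied to the output strings shown above rather than calling the library.

```python
import re


def extract_integers(parsed_text):
    """Return every run of digits in the text as an int, in order of appearance."""
    return [int(match) for match in re.findall(r"\d+", parsed_text)]


# Output strings copied from the parse() examples above.
print(extract_integers("I have 2 hats and 37 coats"))  # [2, 37]
print(extract_integers("1 day of year 2000"))          # [1, 2000]
```

Note that digits already present in the input text would be picked up as well, since the helper cannot tell them apart from converted words.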

Interface #2: Single number

Converts a single number written in words to its corresponding integer.

>>> from number_parser import parse_number
>>> parse_number("two thousand and twenty")
2020
>>> parse_number("not_a_number")

Interface #3: Single ordinal number

Converts a single ordinal number written in words to its corresponding integer (English only).

>>> from number_parser import parse_ordinal
>>> parse_ordinal("twenty third")
23
>>> parse_ordinal("seventy fifth")
75

Language Support

The default language is English. To parse another language, pass the language parameter with the corresponding locale code. Cardinal numbers are supported in English, Hindi, Spanish, and Russian; ordinal numbers are supported in English only.

>>> from number_parser import parse, parse_number
>>> parse("Hay tres gallinas y veintitrés patos", language='es')
'Hay 3 gallinas y 23 patos'
>>> parse_number("चौदह लाख बत्तीस हज़ार पाँच सौ चौबीस", language='hi')
1432524
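
When the language code comes from user input or configuration, it can help to validate it before calling the parser. The sketch below is an illustration only: check_language and the LANGUAGES table are hypothetical, the codes 'es', 'hi' and 'ru' are the ones used in this document, and 'en' for English is an assumption. How the library itself reacts to an unsupported code is not stated here.

```python
# 'es', 'hi' and 'ru' appear in this document's examples; 'en' is an assumption.
LANGUAGES = {"en": "English", "hi": "Hindi", "es": "Spanish", "ru": "Russian"}


def check_language(code):
    """Normalize a language code and raise ValueError if it is not listed."""
    normalized = code.strip().lower()
    if normalized not in LANGUAGES:
        raise ValueError(
            f"Unsupported language {code!r}; expected one of {sorted(LANGUAGES)}"
        )
    return normalized


print(check_language(" ES "))  # es
```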

Supported cases

The library has extensive tests. Some of the supported cases are described below.

Accurately handling conjunctions, whether they appear inside a number or between separate numbers.

>>> parse("doscientos cincuenta y doscientos treinta y uno y doce", language='es')
'250 y 231 y 12'

Handling ambiguous cases without proper separators.

>>> parse("two thousand thousand")
'2000 1000'
>>> parse_number("two thousand two million")
2002000000
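
The two results above can be reproduced by one possible set of rules: a larger scale word multiplies everything accumulated so far, while a bare scale word that is not larger than the previous one starts a new number. The toy below implements only those rules with a deliberately tiny vocabulary. It is a sketch for illustration, not the library's actual algorithm, and split_numbers, SMALL and SCALES are hypothetical names.

```python
SMALL = {
    "one": 1, "two": 2, "three": 3, "four": 4, "five": 5, "six": 6,
    "seven": 7, "eight": 8, "nine": 9, "ten": 10, "twenty": 20, "thirty": 30,
}
SCALES = {"thousand": 1_000, "million": 1_000_000}


def split_numbers(text):
    """Convert a run of number words into a list of integers (toy rules)."""
    numbers = []
    total = 0          # value of the number built so far
    current = 0        # words seen since the last scale word
    last_scale = None  # most recent scale word applied
    started = False
    for word in text.lower().split():
        if word in SMALL:
            current += SMALL[word]
        elif word == "hundred":
            current = max(current, 1) * 100
        elif word in SCALES:
            scale = SCALES[word]
            if last_scale is not None and current == 0 and scale <= last_scale:
                # A bare, non-larger scale cannot extend the number: start a new one.
                numbers.append(total)
                total = scale
            elif last_scale is not None and scale > last_scale:
                # A larger scale multiplies everything accumulated so far.
                total = (total + current) * scale
            else:
                total += max(current, 1) * scale
            current = 0
            last_scale = scale
        else:
            raise ValueError(f"unknown word: {word!r}")
        started = True
    if started:
        numbers.append(total + current)
    return numbers


print(split_numbers("two thousand thousand"))     # [2000, 1000]
print(split_numbers("two thousand two million"))  # [2002000000]
```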

Handling nuances in the language with different forms of the same number.

>>> parse_number("пятисот девяноста шести", language='ru')
596
>>> parse_number("пятистам девяноста шести", language='ru')
596
>>> parse_number("пятьсот девяносто шесть", language='ru')
596

Changes

0.1.0 (2020-07-30)

Initial release.

0.2.0 (2020-08-18)

Ordinal Number Support

0.2.1 (2020-08-25)

Fix tokenization bug - Hindi
