parse numbers written in natural language
Project description
number-parser is a simple library that converts numbers written in natural language to their equivalent numeric forms. It currently supports cardinal numbers in the following languages: English, Hindi, Spanish and Russian.
Installation
pip install number-parser
number-parser requires Python 3.6+.
Usage
The library provides two main APIs, which correspond to the following two common use cases.
Interface #1: Multiple numbers
Identifying the numbers in a text string, converting them to corresponding numeric values while ignoring non-numeric words.
>>> from number_parser import parse
>>> parse("I have two hats and thirty seven coats")
'I have 2 hats and 37 coats'
>>> parse("One, Two, Three go")
'1, 2, 3 go'
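Because parse returns a string, a caller who wants the numeric values themselves needs one more step. The sketch below is one possible way to do that with a regular expression. The helper name extract_ints is hypothetical and not part of number-parser; the inputs are the output strings shown above.

```python
import re
from typing import List


def extract_ints(parsed_text: str) -> List[int]:
    """Collect every run of ASCII digits in `parsed_text` as an int.

    Hypothetical helper for illustration; it is not provided by number-parser.
    """
    return [int(digits) for digits in re.findall(r"[0-9]+", parsed_text)]


# The inputs below are the outputs of parse() shown in the examples above.
print(extract_ints("I have 2 hats and 37 coats"))  # [2, 37]
print(extract_ints("1, 2, 3 go"))                  # [1, 2, 3]
```

Note that this also picks up digits that were already present in the original text, so it only suits inputs where that is acceptable.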
Interface #2: Single number
Converting a single number written in words to its corresponding integer.
>>> from number_parser import parse_number
>>> parse_number("two thousand and twenty")
2020
>>> output = parse_number("not_a_number")
>>> print(output)
None
Language Support
The default language is English. For other languages, pass the language parameter with the corresponding locale code.
>>> from number_parser import parse, parse_number
>>> parse("Hay tres gallinas y veintitrés patos", language='es')
'Hay 3 gallinas y 23 patos'
>>> parse_number("चौदह लाख बत्तीस हज़ार पाँच सौ चौबीस", language='hi')
1432524
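When the language of the input is not known in advance, one option is to try each supported locale in turn. The following is a minimal sketch under several assumptions: the helper parse_number_any is hypothetical, the code 'en' for English is assumed (only 'es', 'hi' and 'ru' appear in the examples here), and parse_number is assumed to return None for text it cannot parse in the given language, as it does for "not_a_number" above. The parser is passed in as an argument so the helper does not depend on the library being importable.

```python
from typing import Callable, Optional, Sequence, Tuple

# 'es', 'hi' and 'ru' appear in the examples; 'en' is an assumed code.
DEFAULT_LANGUAGES = ("en", "es", "hi", "ru")


def parse_number_any(
    text: str,
    parser: Callable[..., Optional[int]],
    languages: Sequence[str] = DEFAULT_LANGUAGES,
) -> Optional[Tuple[str, int]]:
    """Try `parser(text, language=...)` for each language in order.

    Returns (language, value) for the first language that yields a
    number, or None if every attempt returns None.
    """
    for language in languages:
        value = parser(text, language=language)
        if value is not None:
            return language, value
    return None
```

With the library installed, this would be called as parse_number_any("пятьсот девяносто шесть", parse_number) after importing parse_number from number_parser. The first matching language wins, so the order of languages matters for words that are valid in more than one of them.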
Supported cases
The library has extensive tests. Some of the supported cases are described below.
Accurately handling conjunctions, which may be part of a number or may separate two numbers.
>>> parse("doscientos cincuenta y doscientos treinta y uno y doce", language='es')
'250 y 231 y 12'
Handling ambiguous cases without proper separators.
>>> parse("two thousand thousand")
'2000 1000'
>>> parse_number("two thousand two million")
2002000000
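To make the ambiguity concrete, here is a toy sketch of one rule that reproduces the two results above. It is not number-parser's actual algorithm, which this document does not describe; the function toy_parse_cardinals, its lookup tables and its rule are all assumptions made for illustration. The rule: a multiplier larger than the previous one scales everything read so far, a smaller one adds a new group, and a repeated one starts a new number.

```python
from typing import List, Optional

# Hypothetical lookup tables for this sketch; not taken from number-parser.
UNITS = {
    "zero": 0, "one": 1, "two": 2, "three": 3, "four": 4, "five": 5,
    "six": 6, "seven": 7, "eight": 8, "nine": 9, "ten": 10,
    "eleven": 11, "twelve": 12, "thirteen": 13, "fourteen": 14,
    "fifteen": 15, "sixteen": 16, "seventeen": 17, "eighteen": 18,
    "nineteen": 19,
}
TENS = {
    "twenty": 20, "thirty": 30, "forty": 40, "fifty": 50,
    "sixty": 60, "seventy": 70, "eighty": 80, "ninety": 90,
}
MULTIPLIERS = {"thousand": 10**3, "million": 10**6, "billion": 10**9}


def toy_parse_cardinals(text: str) -> List[int]:
    """Parse English cardinal words into a list of integers (toy rule)."""
    numbers: List[int] = []
    total = 0                   # value of the multiplier groups read so far
    current = 0                 # value of the group being read (below 1000)
    last: Optional[int] = None  # most recent multiplier in this number
    seen = False                # whether any number word has been read

    for word in text.lower().replace("-", " ").split():
        if word == "and":
            continue  # treated as filler inside a number
        if word in UNITS:
            current += UNITS[word]
        elif word in TENS:
            current += TENS[word]
        elif word == "hundred":
            current = (current or 1) * 100
        elif word in MULTIPLIERS:
            m = MULTIPLIERS[word]
            if last is None or m > last:
                # Larger multiplier: scale everything read so far.
                total = ((total + current) or 1) * m
            elif m < last:
                # Smaller multiplier: add a new group.
                total += (current or 1) * m
            else:
                # Same multiplier again: close this number, start another.
                numbers.append(total)
                total = (current or 1) * m
            current = 0
            last = m
        else:
            raise ValueError(f"not a number word: {word!r}")
        seen = True

    if seen:
        numbers.append(total + current)
    return numbers


print(toy_parse_cardinals("two thousand thousand"))     # [2000, 1000]
print(toy_parse_cardinals("two thousand two million"))  # [2002000000]
```

The sketch covers English only, rejects non-number words, and does not validate word order, so it is a way to reason about the examples rather than a substitute for the library.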
Handling nuances in the language with different forms of the same number.
>>> parse_number("пятисот девяноста шести", language='ru')
596
>>> parse_number("пятистам девяноста шести", language='ru')
596
>>> parse_number("пятьсот девяносто шесть", language='ru')
596
Contributing
Source code: https://github.com/arnavkapoor/number-parser
Issue tracker: https://github.com/arnavkapoor/number-parser/issues
Changes
0.1.0 (2019-07-30)
Initial release.