Extract numbers from a string

Numbers from String

This Python module provides functions that extract the numbers, or the numeric string tokens, from an input string.

Capturing the numerals in a piece of text is a common preprocessing step for retrieving numerical information from documents. However, because numerals are written in many different ways, it is tricky to capture them all with simple rules. This library packs several regex rules with broad coverage, and we hope it will be a useful tool for NLP researchers.
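To illustrate why simple rules fall short, here is a minimal sketch of a regex-based token finder. The pattern and the name `find_numeric_tokens` are hypothetical and are not the library's actual rule set; they only show how comma-grouped numbers, plain decimals, and leading-dot decimals each need their own alternative.

```python
import re

# Hypothetical pattern for illustration only; NOT the rules used by nums_from_string.
# Alternatives, in order:
#   1. comma-grouped digits with an optional decimal part, e.g. "600,000.5"
#   2. plain digits with an optional decimal part, e.g. "710.4" or "2017"
#   3. a leading-dot decimal, e.g. ".25"
# An optional minus sign may precede any of them.
NUMERIC_TOKEN = re.compile(
    r"-?(?:\d{1,3}(?:,\d{3})+(?:\.\d+)?|\d+(?:\.\d+)?|\.\d+)"
)


def find_numeric_tokens(text):
    """Return every substring of `text` that matches the sketch pattern."""
    return NUMERIC_TOKEN.findall(text)


print(find_numeric_tokens("David spent .25 billion dollars and 600,000.5 dollars."))
# ['.25', '600,000.5']
```

Even this small pattern treats the hyphen in a string such as "A330-300" as a minus sign, which is the kind of ambiguity a fuller rule set has to handle.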

Installation

pip install nums_from_string

Usage

  1. Extract numbers from a string
>>> import nums_from_string
>>> string1 = "U.S. goods and services trade with China totaled an estimated $710.4 billion in 2017. "
>>> nums_from_string.get_nums(string1)
[710.4, 2017]

>>> string2 = "David spent .25 billion dollars buying a building and 600,000.5 dollars getting himself a car."
>>> nums_from_string.get_nums(string2)
[0.25, 600000.5]
  2. Extract numeric strings from a string
>>> string1 = "U.S. goods and services trade with China totaled an estimated $710.4 billion in 2017. "
>>> nums_from_string.get_numeric_string_tokens(string1)
['710.4', '2017']

>>> string2 = "David spent .25 billion dollars buying a building and 600,000.5 dollars getting himself a car."
>>> nums_from_string.get_numeric_string_tokens(string2)
['.25', '600,000.5']

>>> string3 = "Find the product of 4 and -5?"
>>> nums_from_string.get_numeric_string_tokens(string3)
['4', '-5']

>>> string4 = "The flight number is Airbus A330-300"
>>> nums_from_string.get_numeric_string_tokens(string4, no_minus=True)
['330', '300']
  3. Convert strings to numbers
>>> s0 = "255"
>>> nums_from_string.to_num(s0)
255

>>> s1 = "-255,000.0"
>>> nums_from_string.to_num(s1)
-255000.0

>>> s2 = "87/25"
>>> nums_from_string.to_num(s2)
Fraction(87, 25)

>>> s3 = "a1b2"
>>> nums_from_string.to_num(s3)
Traceback (most recent call last):
    ...
ValueError: Invalid numerical string!
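
The conversion behaviour shown above (integers, comma-grouped decimals, fractions, and a `ValueError` for anything else) can be approximated with the standard library alone. The sketch below is an assumption about how such a conversion might be written, not the library's implementation; `parse_number` is a hypothetical name.

```python
from fractions import Fraction


def parse_number(token):
    """Hypothetical stand-in for a to_num-style conversion (not the library's code)."""
    # Drop thousands separators. This is lenient: it does not check that
    # the commas are placed every three digits.
    cleaned = token.replace(",", "")
    try:
        if "/" in cleaned:
            return Fraction(cleaned)   # "87/25" -> Fraction(87, 25)
        if "." in cleaned:
            return float(cleaned)      # "-255000.0" -> -255000.0
        return int(cleaned)            # "255" -> 255
    except ValueError:
        raise ValueError(f"Invalid numerical string: {token!r}") from None


print(parse_number("255"))         # 255
print(parse_number("-255,000.0"))  # -255000.0
print(parse_number("87/25"))       # 87/25
```

Returning an `int` when there is no decimal point and a `float` otherwise mirrors the outputs in the examples above, where "255" stays an integer and "-255,000.0" becomes a float.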

Todo

  • Capture the pattern of fractions in a string
  • Capture patterns such as "-3.5/11"
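
One possible starting point for these Todo items is sketched below. The pattern and the name `find_fraction_tokens` are hypothetical and are not part of the library; the sketch only shows that a fraction token can be described as two number patterns joined by a slash, with an optional leading minus sign.

```python
import re

# Hypothetical building block: digits with an optional decimal part,
# or a leading-dot decimal. Not taken from nums_from_string.
_NUMBER = r"(?:\d+(?:\.\d+)?|\.\d+)"

# A fraction token: optional minus, numerator, slash, denominator.
FRACTION_TOKEN = re.compile(rf"-?{_NUMBER}/{_NUMBER}")


def find_fraction_tokens(text):
    """Return every fraction-like substring of `text`."""
    return FRACTION_TOKEN.findall(text)


print(find_fraction_tokens("The ratio is -3.5/11 and the slope is 87/25."))
# ['-3.5/11', '87/25']
```

Finding the token is only half of the task: converting "-3.5/11" to a number needs extra handling, because `fractions.Fraction` does not accept a decimal numerator in its "a/b" string form.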

License

This project is licensed under the terms of the MIT license.
