Provides a series of functions to make Benford's Law usage more convenient.

Benfords

Benfords provides a series of functions intended to make Benford's Law research and usage more convenient.

Usage

With Benfords you can run quick comparisons to Benford's Law, create outputs such as charts and .csv files, generate random Benford variates, calculate the expected probabilities, and extract your own empirical digit frequencies.

Comparisons to Benford's Law

The benfords() function allows you to quickly compare your data's digits to Benford's Law:

# Assumes the package exports these names at the top level
from benfords import benfords, variate, test

# Generate some random data
test_data = variate(1000)

#Compare to Benford's Law
benfords(test_data)
#  Digit  Expected Value  Actual Value  Difference
#0     1        0.301030         0.316    0.014970
#1     2        0.176091         0.178    0.001909
#2     3        0.124939         0.113   -0.011939
#3     4        0.096910         0.097    0.000090
#4     5        0.079181         0.075   -0.004181
#5     6        0.066947         0.072    0.005053
#6     7        0.057992         0.059    0.001008
#7     8        0.051153         0.041   -0.010153
#8     9        0.045757         0.049    0.003243
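The columns of this table can be reproduced by hand. The following is a minimal sketch, not the package's implementation: compare_first_digits is a hypothetical helper that assumes the first digits have already been extracted, and it computes the expected value as log10(1 + 1/d) and the difference as actual minus expected, which matches the signs in the output above.

```python
import math
from collections import Counter


def compare_first_digits(first_digits):
    """Return (digit, expected, actual, difference) rows for digits 1-9."""
    counts = Counter(first_digits)
    total = len(first_digits)
    rows = []
    for d in range(1, 10):
        expected = math.log10(1 + 1 / d)  # Benford's first-digit probability
        actual = counts[d] / total        # observed share of digit d
        rows.append((d, expected, actual, actual - expected))
    return rows


# Hypothetical first digits already extracted from eight observations.
for row in compare_first_digits([1, 1, 1, 2, 3, 4, 5, 1]):
    print("%d  %.6f  %.3f  %+.6f" % row)
```

With only eight observations the differences are large; they shrink as the sample grows if the data follow Benford's Law.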

Use test() to calculate test statistics:

test(benfords(test_data), test_statistic='d')
# 0.01914657325117365

Currently, test() supports only the d statistic described in Cho and Gaines, 2012. Future releases will include additional test statistics.
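As a rough illustration of what such a statistic measures, the sketch below assumes d is the Euclidean distance between the observed and expected digit proportions. The name euclidean_d is hypothetical, and this README does not say whether test() applies any further scaling, so the values are not guaranteed to match the package's output.

```python
import math

# Benford's expected first-digit probabilities for digits 1..9.
EXPECTED = [math.log10(1 + 1 / d) for d in range(1, 10)]


def euclidean_d(observed, expected=EXPECTED):
    """Euclidean distance between observed and expected digit proportions."""
    return math.sqrt(sum((o - e) ** 2 for o, e in zip(observed, expected)))


print(euclidean_d(EXPECTED))     # perfect conformity gives 0.0
print(euclidean_d([1 / 9] * 9))  # uniform first digits give a larger distance
```

Smaller values indicate closer agreement with Benford's Law.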

benfords() can also write a chart and a .csv file:

benfords(test_data, output_csv=True, output_plot=True, filename='2021-01-31 Analysis')

Figure showing expected and theoretical digit frequencies

There are also parameters for examining digits beyond the first, as well as multiple digits at a time. This example uses the second and third digits:

benfords(test_data, start_position=2, length=2)
#   Digit  Expected Value  Actual Value  Difference
#0      0        0.119679         0.000   -0.119679
#1      1        0.113890         0.000   -0.113890
#2      2        0.108821         0.000   -0.108821
#3      3        0.104330         0.000   -0.104330
#4      4        0.100308         0.000   -0.100308
#..   ...             ...           ...         ...
#95    95        0.027760         0.002   -0.025760
#96    96        0.027558         0.009   -0.018558
#97    97        0.027358         0.005   -0.022358
#98    98        0.027162         0.004   -0.023162
#99    99        0.026969         0.007   -0.019969
#
#[100 rows x 4 columns]

Generate random Benford variates

You can generate random Benford-distributed numbers with the variate() function. Just specify how many you want:

variate(5)

# [6.494198781949683, 5.511615661880242, 7.311726835973362, 1.6809486480388234, 8.877345103827716]

Variates are generated according to the method in Jamain, 2001.
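One standard way to produce such values is to raise 10 to a uniform random exponent. The sketch below uses that construction; benford_variates is a hypothetical name, and whether this is exactly the method the package takes from Jamain, 2001 is an assumption.

```python
import random


def benford_variates(n, seed=None):
    """Draw n values in [1, 10) whose leading digits follow Benford's Law."""
    rng = random.Random(seed)
    # If U is uniform on [0, 1), then log10(10**U) is uniform on [0, 1),
    # which is the defining property of a Benford-distributed significand.
    return [10 ** rng.random() for _ in range(n)]


sample = benford_variates(100_000, seed=0)
share_of_ones = sum(1 for x in sample if x < 2) / len(sample)
print(share_of_ones)  # should be close to log10(2), about 0.301
```

Like the output of variate() above, the values fall between 1 and 10, so the first significant digit is simply the integer part.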

Expected Probabilities

Calculate the probabilities expected under Benford's Law with expectation(). What's the expected probability that the second digit will be 5?

expectation(5, 2)

# 0.09667723580232242
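This value follows from the general Benford formula: the probability that the digit at a given position equals d is the sum of log10(1 + 1/(10k + d)) over every possible block k of preceding digits. A minimal sketch, using a hypothetical expected_digit_probability helper rather than the package's own code:

```python
import math


def expected_digit_probability(digit, position=1):
    """Benford probability that the digit at `position` equals `digit`."""
    if position == 1:
        if not 1 <= digit <= 9:
            raise ValueError("the first significant digit is 1-9")
        return math.log10(1 + 1 / digit)
    if not 0 <= digit <= 9:
        raise ValueError("later digits are 0-9")
    # Sum over every possible prefix k made of (position - 1) leading digits.
    start, stop = 10 ** (position - 2), 10 ** (position - 1)
    return sum(math.log10(1 + 1 / (10 * k + digit)) for k in range(start, stop))


print(round(expected_digit_probability(5, 2), 6))  # 0.096677
```

For the second digit the prefixes are k = 1 to 9, so the sum has nine terms; for the third digit it has ninety.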

Extract the significant digits from your data

fsd() and nsd() return the significant digits of your input data. They currently accept scalars (integers and floats), lists, 1d numpy arrays, and pandas DataFrames.

fsd() returns the first significant digit according to the significand formula provided by Berger and Hill, 2015.

test = [5, 0.321, -2989.2, -0.00001]

fsd(test) 
#array([5., 3., 2., 1.])
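The significand S(x) of a nonzero number is the unique value in [1, 10) such that |x| = S(x) * 10^k for some integer k, and the first significant digit is its integer part. The sketch below illustrates the idea for scalars with Python's decimal module to avoid floating-point rounding near powers of ten; significand and first_significant_digit are hypothetical names, and this is not necessarily how fsd() is implemented.

```python
from decimal import Decimal


def significand(x):
    """Return S(x) in [1, 10) such that |x| = S(x) * 10**k for an integer k."""
    if x == 0:
        raise ValueError("zero has no significant digits")
    value = Decimal(str(abs(x)))
    # adjusted() is the exponent of the most significant digit, i.e. k.
    return value.scaleb(-value.adjusted())


def first_significant_digit(x):
    """Integer part of the significand."""
    return int(significand(x))


print([first_significant_digit(v) for v in [5, 0.321, -2989.2, -0.00001]])
# [5, 3, 2, 1]
```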

nsd() returns the nth significant digit of your data. It always returns a numpy array of strings. Its parameters select the second, third, and higher digits and control how many digits are returned. This example shows the 4th significant digit.

nsd(test, 4) 
#array(['0', '1', '8', '0'])

Citation

If you use this work in your own research, please cite it in your publications:

McCarville, Daniel. Benford's Law, (2021). https://github.com/danielmccarville/Benfords

License

MIT
