benfordslaw tests whether an empirical (observed) distribution differs significantly from a theoretical (expected, Benford's) distribution.

benfordslaw

benfordslaw is a Python package that tests whether an empirical (observed) distribution differs significantly from a theoretical (expected, Benford's) distribution. Benford's Law states that in many naturally occurring collections of numbers, the leading significant digit is likely to be small. Use this method to test whether a set of numbers may be artificial or manipulated. If a set of values follows Benford's Law, then a model's predicted values for it should also follow Benford's Law. Normal (unmanipulated) data tends to follow Benford's Law, whereas manipulated or fraudulent data tends to deviate from it.
Assumptions about the data:
1. The numbers need to be random and not assigned, with no imposed minimums or maximums.
2. The numbers should cover several orders of magnitude.
3. The dataset should preferably contain at least 1000 samples, although Benford's Law has been shown to hold for datasets containing as few as 50 numbers.
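
To make the comparison concrete, here is a minimal standard-library sketch of the idea described above: the expected first-digit probabilities under Benford's Law, P(d) = log10(1 + 1/d), compared with observed leading digits through a Pearson chi-square statistic. The helper names (`benford_first_digit_probs`, `leading_digit`, `chi_square_statistic`) are hypothetical and written for illustration only. This is not necessarily how benfordslaw computes its test internally.

```python
import math
from collections import Counter


def benford_first_digit_probs():
    """Expected probability of each leading digit 1..9 under Benford's Law."""
    return {d: math.log10(1 + 1 / d) for d in range(1, 10)}


def leading_digit(value):
    """Return the first significant digit of a number, or None for zero."""
    value = abs(value)
    if value == 0:
        return None
    # Scientific notation places the first significant digit up front.
    return int(f"{value:.15e}"[0])


def chi_square_statistic(values):
    """Pearson chi-square statistic of observed leading digits vs. Benford."""
    digits = [d for d in map(leading_digit, values) if d is not None]
    observed = Counter(digits)
    n = len(digits)
    expected = benford_first_digit_probs()
    return sum(
        (observed.get(d, 0) - n * p) ** 2 / (n * p)
        for d, p in expected.items()
    )


# Illustrative input (an assumption, not package data): powers of 2 are a
# classic sequence whose leading digits follow Benford's Law closely.
sample = [2 ** k for k in range(1, 201)]
print(benford_first_digit_probs()[1])  # about 0.301
print(chi_square_statistic(sample))
```

With nine digit classes the statistic has 8 degrees of freedom, so at alpha=0.05 a value above roughly 15.507 would suggest a significant deviation from Benford's Law.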

Installation

Install benfordslaw from PyPI (recommended). benfordslaw is compatible with Python 3.6+ and runs on Linux, macOS, and Windows.
It is distributed under the MIT license.

pip install benfordslaw

Alternatively, install benfordslaw from the GitHub source:

git clone https://github.com/erdogant/benfordslaw.git
cd benfordslaw
pip install -U .

Import benfordslaw package

from benfordslaw import benfordslaw

# Initialize
bl = benfordslaw(alpha=0.05)

# Load elections example
df = bl.import_example(data='USA')

# Extract election information.
X = df['votes'].loc[df['candidate']=='Donald Trump'].values

# Print
print(X)
# array([ 5387, 23618,  1710, ...,    16,    21,     0], dtype=int64)

# Make fit
results = bl.fit(X)

# Plot
bl.plot(title='Donald Trump')

Analyze second digit-distribution

from benfordslaw import benfordslaw

# Initialize and set to analyze the second digit position
bl = benfordslaw(pos=2)
# USA example
df = bl.import_example(data='USA')
# Extract election information.
X = df['votes'].loc[df['candidate']=='Donald Trump'].values
results = bl.fit(X)
# Plot
bl.plot(title='Donald Trump', barcolor=[0.5, 0.5, 0.5], fontsize=12, barwidth=0.4)
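
For reference, the theoretical second-digit distribution under Benford's Law can be derived by summing over all possible first digits: P(d2) = sum over d1 = 1..9 of log10(1 + 1/(10*d1 + d2)). The sketch below computes it with the standard library. The function name is hypothetical, and whether benfordslaw uses exactly these expected values for `pos=2` is an assumption.

```python
import math


def benford_second_digit_probs():
    """Expected probability of each second digit 0..9 under Benford's Law."""
    return {
        d2: sum(math.log10(1 + 1 / (10 * d1 + d2)) for d1 in range(1, 10))
        for d2 in range(10)
    }


probs = benford_second_digit_probs()
for digit, p in probs.items():
    print(f"{digit}: {p:.4f}")
```

The second-digit distribution is much flatter than the first-digit one, so larger samples are generally needed to detect deviations.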

Analyze last digit-distribution

from benfordslaw import benfordslaw

# Initialize and set to analyze the last digit position
bl = benfordslaw(pos=-1)
# USA example
df = bl.import_example(data='USA')
# Extract election information.
X = df['votes'].loc[df['candidate']=='Donald Trump'].values
results = bl.fit(X)
# Plot
bl.plot(title='Donald Trump', barcolor=[0.5, 0.5, 0.5], fontsize=12, barwidth=0.4)

Analyze second-to-last digit-distribution

from benfordslaw import benfordslaw

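# Initialize and set to analyze the second-to-last digit position
bl = benfordslaw(pos=-2)
# USA example
df = bl.import_example(data='USA')
# Extract election information.
X = df['votes'].loc[df['candidate']=='Donald Trump'].values
results = bl.fit(X)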
# Plot
bl.plot(title='Donald Trump', barcolor=[0.5, 0.5, 0.5], fontsize=12, barwidth=0.4)
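
The `pos` values used in the examples (1, 2, -1, -2) select which digit of each number is analyzed. The sketch below illustrates that idea for integer counts with a hypothetical helper, `digit_at`, which is not part of benfordslaw. Positive positions count from the leading digit, negative positions count from the end, and numbers with too few digits are skipped (returned as None). How the package itself handles short numbers is not covered here.

```python
from collections import Counter


def digit_at(value, pos):
    """Return the digit of an integer at a 1-based position.

    Negative positions count from the end (-1 is the last digit).
    Returns None when the number has too few digits for the position.
    """
    if pos == 0:
        raise ValueError("pos must be nonzero")
    digits = str(abs(int(value)))
    index = pos - 1 if pos > 0 else pos
    if not -len(digits) <= index < len(digits):
        return None
    return int(digits[index])


# A few vote counts taken from the printed array in the first example.
votes = [5387, 23618, 1710, 16, 21, 0]
last_digits = Counter(digit_at(v, -1) for v in votes)
second_last = [digit_at(v, -2) for v in votes]
print(last_digits)
print(second_last)
```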

Citation

Please cite benfordslaw in your publications if this is useful for your research.

Maintainer

Erdogan Taskesen, github: erdogant
This work is created and maintained in my free time. If you would like to buy me a coffee for this work, it is much appreciated.
Contributions are welcome.
Star it if you like it!

Release history

1.2.2 (Jan 27, 2023)
1.2.1 (Oct 14, 2022)
1.2.0 (Aug 22, 2022)
1.1.3 (Aug 1, 2022)
1.1.2 (Aug 1, 2022)
1.1.1 (Jun 5, 2022)
1.1.0 (Mar 11, 2022)
1.0.5 (Nov 30, 2021)
1.0.4 (Oct 17, 2021), this version
1.0.3 (Oct 17, 2021)
1.0.2 (Jan 4, 2021)
1.0.1 (Sep 27, 2020)
1.0.0 (Sep 23, 2020)
0.1.3 (Aug 14, 2020)
0.1.2 (Feb 15, 2020)
0.1.1 (Feb 9, 2020)

Download files

Source Distribution: benfordslaw-1.0.4.tar.gz (8.5 kB), uploaded Oct 17, 2021

Built Distribution: benfordslaw-1.0.4-py3-none-any.whl (9.5 kB, Python 3), uploaded Oct 17, 2021

Hashes for benfordslaw-1.0.4.tar.gz
Algorithm	Hash digest
SHA256	`dda422b29a8d228e2857026309eb6c818488e5d2d2582ad8171aa3754bfcac7a`
MD5	`b74d35dea278c84b1cd1eb45258ab10b`
BLAKE2b-256	`84fab7fca2f79df477d1775e0202edf1a46ac8dc67a3162ebc2c6ce7f7024b64`

Hashes for benfordslaw-1.0.4-py3-none-any.whl
Algorithm	Hash digest
SHA256	`81122bbd82424667a2fec52d601b65336c080ca20b619b32804cb211e9f136de`
MD5	`20cf22795d0cdf0d5612bbad03e86b8e`
BLAKE2b-256	`c7ad20bf56127de4620818fd2eaf637815009acb36e6ec820e56a1a94308b158`