pybenford
Professional-grade Benford's Law analysis toolkit for forensic accounting, auditing, and fraud detection.
Why pybenford?
Existing Benford's Law packages on PyPI are basic (first-digit chi-square only), outdated, or unmaintained. pybenford implements the complete Nigrini forensic accounting workflow described in Benford's Law: Applications for Forensic Accounting, Auditing, and Fraud Detection (Nigrini, 2012):
- MAD conformity classification with Nigrini's thresholds: close, acceptable, marginally acceptable, nonconformity
- Distortion factor model detecting overstatement vs. understatement
- Second-order test on differences of sorted values
- Summation test with uniform 1/90 expectation
- Mantissa arc test (Alexander, 2009) with L-squared statistic
- Number duplication analysis
- All standard digit tests: first, second, third, first-two, first-three, last-two
- Z-statistic with Fleiss continuity correction, chi-square, Kolmogorov-Smirnov
- Publication-quality matplotlib visualizations
- Pure NumPy internals — no pandas dependency, fast on large datasets
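To make the MAD classification in the list above concrete, here is a minimal standalone sketch in plain Python. It is not pybenford's implementation: the helper names are hypothetical, and the cutoffs (0.006, 0.012, 0.015) are the first-digit values commonly quoted from Nigrini (2012), so treat them as an assumption and check them against the book or the package itself.

```python
import math
from decimal import Decimal


def benford_first_digit_probs():
    """Expected first-digit proportions: P(d) = log10(1 + 1/d) for d = 1..9."""
    return {d: math.log10(1 + 1 / d) for d in range(1, 10)}


def first_digit(x):
    """First significant digit of a nonzero number (hypothetical helper)."""
    # Decimal stores the coefficient without leading zeros, so the first
    # entry of `digits` is the first significant digit.
    return Decimal(str(abs(x))).as_tuple().digits[0]


def mad_first_digit(values):
    """Mean absolute deviation between observed and expected proportions."""
    expected = benford_first_digit_probs()
    counts = {d: 0 for d in expected}
    for v in values:
        counts[first_digit(v)] += 1
    n = len(values)
    return sum(abs(counts[d] / n - expected[d]) for d in expected) / len(expected)


def classify_first_digit_mad(mad):
    """Map a first-digit MAD to a conformity label (assumed cutoffs)."""
    if mad <= 0.006:
        return "close"
    if mad <= 0.012:
        return "acceptable"
    if mad <= 0.015:
        return "marginally acceptable"
    return "nonconformity"


# Powers of 2 are a classic Benford-conforming sequence.
powers_of_two = [2 ** k for k in range(1, 1001)]
print(classify_first_digit_mad(mad_first_digit(powers_of_two)))
```

The other digit tests use different cutoffs, so a real classifier would need one threshold set per test.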
Installation
```bash
pip install pybenford
```
Quick Start
```python
from pybenford import BenfordAnalysis

# Load your data (list, NumPy array, or pandas/polars Series)
data = [...]  # e.g., invoice amounts, population figures, financial values

# Create the analysis object; cleaning happens automatically
analysis = BenfordAnalysis(data, sign_filter="positive", min_abs_value=10.0)

# View data profile (Nigrini Ch. 4)
print(analysis.profile)

# Run the first-digit test
result = analysis.first_digit()
print(f"MAD: {result.mad:.6f} ({result.mad_conformity})")
print(f"Chi-square: {result.chi_square:.2f} (significant: {result.chi_square_significant})")

# Run the first-two digits test (most useful for forensic work)
result = analysis.first_two_digits()

# Run all advanced tests
summation = analysis.summation()
second_order = analysis.second_order()
distortion = analysis.distortion_factor()
mantissa = analysis.mantissa_arc()
duplicates = analysis.number_duplication(top_n=20)
```
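If you have no real data at hand, one way to try the workflow is with synthetic values. The sketch below uses only the standard library and hypothetical helper names. It builds a log-uniform sample, which follows Benford's Law when it spans whole orders of magnitude, and a uniform sample, which does not, then compares how often each starts with the digit 1.

```python
import random
from decimal import Decimal


def first_digit(x):
    """First significant digit of a nonzero number (hypothetical helper)."""
    return Decimal(str(abs(x))).as_tuple().digits[0]


def share_of_leading_ones(values):
    """Fraction of values whose first significant digit is 1."""
    return sum(1 for v in values if first_digit(v) == 1) / len(values)


rng = random.Random(42)  # fixed seed so the samples are reproducible

# Log-uniform over four full orders of magnitude (10 to 100,000):
# the mantissas are uniform, so first digits follow Benford's Law.
benford_like = [10 ** rng.uniform(1, 5) for _ in range(20_000)]

# Uniform amounts over the same range: about 90% of them fall in
# [10,000, 100,000), so each leading digit appears roughly equally often.
uniform_amounts = [rng.uniform(10, 100_000) for _ in range(20_000)]

print(f"leading 1s, log-uniform: {share_of_leading_ones(benford_like):.3f}")
print(f"leading 1s, uniform:     {share_of_leading_ones(uniform_amounts):.3f}")
```

Either list can then be passed as `data` in the Quick Start code above.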
Visualization
Every plot function returns a (Figure, Axes) pair and has no side effects, so you keep full control over the figure.
```python
from pybenford.visualization import plot_digit_test, plot_mantissa_arc, plot_z_scores

# Digit distribution with confidence bands
result = analysis.first_two_digits()
fig, ax = plot_digit_test(result, show_confidence=True)
fig.savefig("first_two_digits.png", dpi=150)

# Mantissa arc test
arc = analysis.mantissa_arc()
fig, ax = plot_mantissa_arc(arc, analysis.clean_data)

# Z-scores with significance thresholds
fig, ax = plot_z_scores(result, critical_value=1.96)
```
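For intuition about what a mantissa arc plot represents, the sketch below maps each value's mantissa (the fractional part of its base-10 logarithm) to a point on the unit circle and measures how far the center of gravity of those points lies from the origin. This is a standalone illustration with a hypothetical function name. It assumes L-squared is the squared distance of that center from the origin, and it does not attempt the p-value calculation.

```python
import math


def mantissa_arc_center(values):
    """Return (x_mean, y_mean, l_squared) for nonzero values.

    Each mantissa m in [0, 1) becomes the point
    (cos(2*pi*m), sin(2*pi*m)) on the unit circle.
    """
    xs, ys = [], []
    for v in values:
        mantissa = math.log10(abs(v)) % 1.0
        angle = 2 * math.pi * mantissa
        xs.append(math.cos(angle))
        ys.append(math.sin(angle))
    n = len(values)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    # Uniform mantissas pull the center toward (0, 0), so L-squared is
    # near 0; clustered mantissas push it toward the circle (near 1).
    return x_mean, y_mean, x_mean ** 2 + y_mean ** 2


# Evenly spaced mantissas: center of gravity at the origin.
even = [10 ** (k / 1000) for k in range(1000)]
# Identical mantissas (2, 20, 200, ...): all points coincide.
clustered = [2, 20, 200, 2000]

print(mantissa_arc_center(even)[2], mantissa_arc_center(clustered)[2])
```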
Available Tests
| Method | Description | Reference |
|---|---|---|
| `first_digit()` | First significant digit (1-9) | Nigrini Ch. 5 |
| `second_digit()` | Second significant digit (0-9) | Nigrini Ch. 5 |
| `third_digit()` | Third significant digit (0-9) | Nigrini Ch. 5 |
| `first_two_digits()` | First two digits (10-99) | Nigrini Ch. 5 |
| `first_three_digits()` | First three digits (100-999) | Nigrini Ch. 5 |
| `last_two_digits()` | Last two digits (00-99), uniform expected | Nigrini Ch. 5 |
| `second_order()` | Digit test on sorted differences | Nigrini Ch. 6 |
| `summation()` | Sum proportions vs. uniform 1/90 | Nigrini Ch. 5 |
| `distortion_factor()` | Overstatement/understatement detection | Nigrini Ch. 6 |
| `mantissa_arc()` | Uniformity of mantissas on unit circle | Nigrini Ch. 7 |
| `number_duplication()` | Most frequently duplicated values | Nigrini Ch. 5 |
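The summation test in the table can be sketched in a few lines: group the amounts by their first two digits, sum the amounts in each group, and compare each group's share of the grand total with the uniform expectation of 1/90. The code below is an illustrative standalone version with hypothetical helper names, assuming positive, nonzero inputs. It is not pybenford's implementation.

```python
from decimal import Decimal


def first_two_digits(x):
    """First two significant digits (10-99) of a nonzero number."""
    digits = Decimal(str(abs(x))).as_tuple().digits
    # Pad single-digit coefficients so that 7 is read as 70.
    padded = digits + (0,)
    return 10 * padded[0] + padded[1]


def summation_proportions(values):
    """Share of the total amount carried by each first-two-digit group."""
    sums = {d: 0.0 for d in range(10, 100)}
    for v in values:
        sums[first_two_digits(v)] += v
    total = sum(sums.values())
    return {d: s / total for d, s in sums.items()}


amounts = [12.5, 120, 1250, 99]
proportions = summation_proportions(amounts)
expected = 1 / 90  # uniform expectation for every group

# Groups whose share is far above 1/90 deserve a closer look.
largest = max(proportions, key=proportions.get)
print(largest, proportions[largest] - expected)
```

Large spikes in this test typically point to a few very large amounts or to many repeated mid-sized amounts sharing the same leading digits.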
Statistical Measures
Each digit test result includes:
- Z-statistic per digit bin (Fleiss continuity correction)
- Chi-square goodness-of-fit with critical value
- Kolmogorov-Smirnov statistic with critical value
- MAD (Mean Absolute Deviation) with Nigrini's conformity classification
- Per-bin significance flags at configurable alpha
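As a rough guide to how the first two measures are computed, the sketch below shows a per-bin Z-statistic with a continuity correction of 1/(2n) and a chi-square goodness-of-fit statistic. The function names are hypothetical, and the correction is assumed to match what this README calls the Fleiss correction. It is applied only when it is smaller than the absolute difference, following the usual convention.

```python
import math


def z_statistic(observed_prop, expected_prop, n):
    """Z-statistic for one digit bin with a 1/(2n) continuity correction."""
    diff = abs(observed_prop - expected_prop)
    correction = 1 / (2 * n)
    if correction < diff:
        diff -= correction
    std_error = math.sqrt(expected_prop * (1 - expected_prop) / n)
    return diff / std_error


def chi_square(observed_counts, expected_props):
    """Chi-square statistic: sum of (observed - expected)^2 / expected."""
    n = sum(observed_counts)
    return sum(
        (obs - n * p) ** 2 / (n * p)
        for obs, p in zip(observed_counts, expected_props)
    )


benford = [math.log10(1 + 1 / d) for d in range(1, 10)]

# Digit 1 observed in 33% of 1,000 records versus about 30.1% expected.
z = z_statistic(0.33, benford[0], 1000)

# First-digit counts for 1,000 records that track Benford closely.
counts = [301, 176, 125, 97, 79, 67, 58, 51, 46]
chi2 = chi_square(counts, benford)

# For the first-digit test there are 8 degrees of freedom; the 5%
# critical value of the chi-square distribution is 15.507.
print(f"z = {z:.3f}, chi-square = {chi2:.4f}")
```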
References
- Nigrini, M.J. (2012). Benford's Law: Applications for Forensic Accounting, Auditing, and Fraud Detection. Wiley.
- Miller, S.J. (2015). Benford's Law: Theory and Applications. Princeton University Press.
- Kossovsky, A.E. (2014). Benford's Law: Theory, the General Law of Relative Quantities, and Forensic Fraud Detection Applications. World Scientific.
License
MIT License. See LICENSE for details.
File details
Details for the file pybenford-0.1.0.tar.gz.
File metadata
- Download URL: pybenford-0.1.0.tar.gz
- Upload date:
- Size: 89.2 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.13.13
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 633381d391f9a9e757651bdab6857fd52bce1eaccea0fba3d2c294dbd65e45d4 |
| MD5 | cfe5bf6c2b57d1cb6d21223a7944bc12 |
| BLAKE2b-256 | 3e421374897a8e547b3c8b9e2210f048a20efec4406171b5bc8b34c28b6571a6 |
File details
Details for the file pybenford-0.1.0-py3-none-any.whl.
File metadata
- Download URL: pybenford-0.1.0-py3-none-any.whl
- Upload date:
- Size: 30.1 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.13.13
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 939a4f711fb1916893f2a6307c5664fd7fb2905f34349bfd94e7a1698090174a |
| MD5 | fbad452dd9a73b858ff3ece7b491b73f |
| BLAKE2b-256 | e088003494d051833a043131c1c854fc0bd98faabc1147a24c544c93dcfd742a |