pybenford

Professional-grade Benford's Law analysis toolkit for forensic accounting, auditing, and fraud detection.

Why pybenford?

Existing Benford's Law packages on PyPI cover first-digit chi-square and not much else. Most are unmaintained. pybenford implements the complete Nigrini forensic accounting workflow from Benford's Law: Applications for Forensic Accounting, Auditing, and Fraud Detection (Nigrini, 2012):

MAD conformity classification with Nigrini's empirical thresholds (close, acceptable, marginally acceptable, nonconformity)
Distortion factor model for detecting overstatement vs. understatement
Second-order test on differences of sorted values
Summation test with uniform 1/90 expectation
Mantissa arc test (Alexander, 2009) with L-squared statistic
Number duplication analysis
All standard digit tests: first, second, third, first-two, first-three, last-two
Z-statistic with Fleiss continuity correction, chi-square, Kolmogorov-Smirnov
Publication-quality matplotlib visualizations
Pure NumPy internals, no pandas dependency
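
As an illustration of the distortion factor model listed above, here is a minimal standard-library sketch. It is not pybenford's implementation. The function name `distortion_factor` is hypothetical, and the formulas (collapsing each value into [10, 100), the expected mean 90 / (N(10^(1/N) - 1)), and the 0.638253 / sqrt(N) standard deviation) follow Nigrini's published model as recalled here, so check them against the book before relying on them.

```python
import math


def distortion_factor(values):
    """Return (DF, Z) for the records with absolute value >= 10 (hypothetical helper)."""
    collapsed = []
    for v in values:
        v = abs(v)
        if v < 10:
            continue  # assumption: values below 10 are left out of the model
        # Shift the decimal point so the value lands in [10, 100).
        collapsed.append(10 * v / 10 ** math.floor(math.log10(v)))
    n = len(collapsed)
    if n == 0:
        raise ValueError("no values with absolute value >= 10")
    actual_mean = sum(collapsed) / n
    # Mean of a geometric sequence of n points spanning [10, 100).
    expected_mean = 90 / (n * (10 ** (1 / n) - 1))
    df = (actual_mean - expected_mean) / expected_mean
    # Assumed standard deviation of DF, as recalled from Nigrini's model.
    z = df / (0.638253 / math.sqrt(n))
    return df, z


# A perfect geometric sequence over [10, 100) should give DF close to zero.
geometric = [10 * 10 ** (i / 1000) for i in range(1000)]
df, z = distortion_factor(geometric)
print(f"DF={df:.6f}")
```

A positive DF suggests values pushed upward within their order of magnitude (overstatement), and a negative DF suggests the opposite.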

Installation

pip install pybenford

Quick Start

from pybenford import BenfordAnalysis

analysis = BenfordAnalysis(data)  # list, numpy array, or pandas/polars Series
result = analysis.first_digit()
print(result)

Every result object prints as a formatted report, so a few lines of code take raw data to a conformity report:

=======================================================
  First Digit Test  (n=3,195  alpha=0.05)
=======================================================
 Digit   Count   Observed   Expected   Z-Score   Sig
     1    956   29.92%    30.10%      0.20
     2    595   18.62%    17.61%      1.48
     3    389   12.18%    12.49%      0.52
     4    299    9.36%     9.69%      0.61
     5    255    7.98%     7.92%      0.10
     6    197    6.17%     6.69%      1.16
     7    180    5.63%     5.80%      0.36
     8    171    5.35%     5.12%      0.57
     9    153    4.79%     4.58%      0.53
-------------------------------------------------------
 MAD:        0.0034 — Close Conformity
 Chi-Square: 4.6922  (critical: 15.5073) — Pass
 KS:         0.0083  (critical: 0.0240)  — Pass
=======================================================
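
The summary lines of this report can be recomputed by hand from the Count column. The following is a minimal standard-library sketch, not pybenford's internals. The KS critical value formula, sqrt(-ln(alpha / 2) / 2) / sqrt(n), is an assumption that appears consistent with the 0.0240 printed above.

```python
import math

# Counts for digits 1-9, copied from the report above.
counts = [956, 595, 389, 299, 255, 197, 180, 171, 153]
n = sum(counts)

# Benford's expected first-digit proportions: log10(1 + 1/d).
expected = [math.log10(1 + 1 / d) for d in range(1, 10)]
observed = [c / n for c in counts]

# MAD: mean absolute deviation between observed and expected proportions.
mad = sum(abs(o - e) for o, e in zip(observed, expected)) / len(counts)

# Chi-square goodness-of-fit statistic (9 bins, so 8 degrees of freedom).
chi_square = sum((c - n * e) ** 2 / (n * e) for c, e in zip(counts, expected))

# KS statistic: largest gap between the two cumulative distributions.
ks = 0.0
cum_obs = cum_exp = 0.0
for o, e in zip(observed, expected):
    cum_obs += o
    cum_exp += e
    ks = max(ks, abs(cum_obs - cum_exp))

# Assumed asymptotic critical value at significance level alpha.
alpha = 0.05
ks_critical = math.sqrt(-math.log(alpha / 2) / 2) / math.sqrt(n)

print(f"MAD={mad:.4f}  chi2={chi_square:.2f}  KS={ks:.4f}  KS critical={ks_critical:.4f}")
```

The printed values should agree with the MAD, Chi-Square, and KS lines of the report to the precision shown there.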

For tests with many digit bins (first-two, first-three), the display shows only flagged digits instead of all 90 or 900 rows:

=======================================================
  First Two Digits Test  (n=3,195  alpha=0.05)
=======================================================
 Flagged Digits (7 of 90):
 Digit   Count   Observed   Expected   Z-Score
    35     24    0.75%     1.22%      2.35  *
    49     16    0.50%     0.88%      2.19  *
    66     33    1.03%     0.65%      2.56  *
    70     29    0.91%     0.62%      1.99  *
    75     28    0.88%     0.58%      2.13  *
    76      9    0.28%     0.57%      2.03  *
    77      9    0.28%     0.56%      1.99  *
-------------------------------------------------------
 MAD:        0.0015 — Acceptable Conformity
 Chi-Square: 104.9157  (critical: 112.0220) — Pass
 KS:         0.0102  (critical: 0.0240)  — Pass
=======================================================

All results are also accessible programmatically:

result = analysis.first_digit()

result.mad                     # 0.0034
result.mad_conformity          # "close"
result.chi_square              # 4.6922
result.chi_square_significant  # False
result.ks_statistic            # 0.0083
result.ks_critical             # 0.0240
result.z_scores                # array of per-digit Z-scores
result.significant_flags       # bool array of flagged digits
result.observed                # array of observed proportions
result.expected                # array of expected Benford proportions
result.digits                  # array of digit labels
result.counts                  # array of raw counts
result.n                       # number of records analyzed
result.alpha                   # significance level used
result.test_name               # e.g. "First Digit Test"
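
One use for these attributes is pulling out only the flagged bins, much as the first-two display does. The sketch below relies only on attribute names listed above. `flagged_rows` is a hypothetical helper, and the `SimpleNamespace` object is a stand-in for a real result so the snippet runs without data.

```python
from types import SimpleNamespace


def flagged_rows(result):
    """Return (digit, count, z_score) for every bin flagged as significant."""
    return [
        (digit, count, z)
        for digit, count, z, flagged in zip(
            result.digits, result.counts, result.z_scores, result.significant_flags
        )
        if flagged
    ]


# Stand-in for illustration only. Rows 35 and 49 are copied from the
# first-two report above; the middle row is a made-up unflagged bin.
demo = SimpleNamespace(
    digits=[35, 36, 49],
    counts=[24, 40, 16],
    z_scores=[2.35, 0.10, 2.19],
    significant_flags=[True, False, True],
)

for digit, count, z in flagged_rows(demo):
    print(f"{digit}: count={count}, z={z:.2f}")
```

In practice the argument would be the object returned by a digit test such as analysis.first_two_digits().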

Demo Notebook

A complete walkthrough of every test and visualization is available in examples/demo.ipynb. It runs against US Census county population data and shows the output of all 11 tests, 6 plot functions, and programmatic result access.

Data Preparation

analysis = BenfordAnalysis(
    data,                        # list, array, or Series of numbers
    sign_filter="positive",      # "all", "positive", or "negative"
    min_abs_value=10.0,          # exclude small values (optional)
    drop_zero=True,              # exclude zeros (default: True)
)

print(analysis.profile)          # data profile per Nigrini Ch. 4

sign_filter restricts the analysis to positive or negative values, so income and expense items can be tested independently. min_abs_value excludes values below a minimum magnitude, since very small numbers distort digit distributions.
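
To make the effect of these options concrete, here is a plain-Python sketch of the filtering they describe. The function `prepare` is hypothetical, and its edge-case behavior is an assumption: the README does not say whether the min_abs_value bound is inclusive or how zeros interact with sign_filter.

```python
def prepare(values, sign_filter="all", min_abs_value=None, drop_zero=True):
    """Filter records following the option descriptions above (assumed semantics)."""
    if sign_filter not in ("all", "positive", "negative"):
        raise ValueError(f"unknown sign_filter: {sign_filter!r}")
    kept = []
    for v in values:
        if drop_zero and v == 0:
            continue
        if sign_filter == "positive" and v < 0:
            continue
        if sign_filter == "negative" and v > 0:
            continue
        # Assumption: the bound is inclusive, so abs(v) == min_abs_value is kept.
        if min_abs_value is not None and abs(v) < min_abs_value:
            continue
        kept.append(v)
    return kept


ledger = [-50.0, 0, 3.2, 10.0, 125.5]
print(prepare(ledger, sign_filter="positive", min_abs_value=10.0))
print(prepare(ledger, sign_filter="negative"))
```

With this ledger, the first call keeps only the positive amounts of at least 10, and the second keeps only the single negative amount.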

Visualization

Plot functions return (Figure, Axes) with no side effects.

from pybenford.visualization import plot_digit_test, plot_mantissa_arc

result = analysis.first_two_digits()
fig, ax = plot_digit_test(result, show_confidence=True)
fig.savefig("first_two_digits.png", dpi=150)

arc = analysis.mantissa_arc()
fig, ax = plot_mantissa_arc(arc)

from pybenford.visualization import plot_z_scores
fig, ax = plot_z_scores(result, critical_value=1.96)

Available Tests

Method	Description	Reference
`first_digit()`	First significant digit (1-9)	Nigrini Ch. 5
`second_digit()`	Second significant digit (0-9)	Nigrini Ch. 5
`third_digit()`	Third significant digit (0-9)	Nigrini Ch. 5
`first_two_digits()`	First two digits (10-99)	Nigrini Ch. 5
`first_three_digits()`	First three digits (100-999)	Nigrini Ch. 5
`last_two_digits()`	Last two digits (00-99), uniform expected	Nigrini Ch. 5
`second_order()`	Digit test on sorted differences	Nigrini Ch. 6
`summation()`	Sum proportions vs. uniform 1/90	Nigrini Ch. 5
`distortion_factor()`	Overstatement/understatement detection	Nigrini Ch. 6
`mantissa_arc()`	Uniformity of mantissas on unit circle	Nigrini Ch. 7
`number_duplication()`	Most frequently duplicated values	Nigrini Ch. 5
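
Two of the less familiar entries in this table, the summation test and the second-order test, can be sketched in a few lines. This is a minimal illustration under stated assumptions, not pybenford's code: the helper names are hypothetical, digits are extracted through scientific-notation formatting, and zero differences in the second-order test are simply dropped.

```python
def first_two_digits(x):
    """First two significant digits of a nonzero number, e.g. 0.29 -> 29."""
    mantissa = f"{abs(x):.10e}".split("e")[0]  # e.g. '2.9000000000'
    return int(mantissa.replace(".", "")[:2])


def summation_shares(values):
    """Share of the total absolute amount in each first-two-digit bin.

    Under Benford's Law each of the 90 bins is expected to hold about 1/90.
    """
    sums = {d: 0.0 for d in range(10, 100)}
    for v in values:
        if v == 0:
            continue  # zeros have no leading digits
        sums[first_two_digits(v)] += abs(v)
    total = sum(sums.values())
    return {d: s / total for d, s in sums.items()}


def second_order_digits(values):
    """First-two digits of the positive differences between sorted values."""
    ordered = sorted(values)
    diffs = [b - a for a, b in zip(ordered, ordered[1:])]
    return [first_two_digits(d) for d in diffs if d > 0]


amounts = [10.0, 12.5, 12.5, 40.0, 100.0]
shares = summation_shares(amounts)
print(round(shares[10], 4), round(shares[12], 4), round(shares[40], 4))
print(second_order_digits(amounts))
```

Formatting the number rather than dividing by a power of ten avoids floating-point slips such as 0.29 / 0.01 evaluating to just under 29.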

Statistical Measures

Each digit test result includes:

Z-statistic per digit bin (Fleiss continuity correction)
Chi-square goodness-of-fit with critical value
Kolmogorov-Smirnov statistic with critical value
MAD (Mean Absolute Deviation) with Nigrini's conformity classification
Per-bin significance flags at configurable alpha
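
The per-bin Z-statistic can be checked against the sample reports in Quick Start. The sketch below is a standard-library illustration, and `z_statistic` is a hypothetical name. It assumes the usual rule that the 1/(2n) continuity correction is applied only when it is smaller than the absolute gap between observed and expected proportions; whether pybenford applies the correction conditionally is not stated in this README.

```python
import math


def z_statistic(count, n, expected):
    """Z-statistic for one digit bin with a 1/(2n) continuity correction."""
    observed = count / n
    gap = abs(observed - expected)
    correction = 1 / (2 * n)
    # Apply the correction only when it is smaller than the gap.
    if correction < gap:
        gap -= correction
    return gap / math.sqrt(expected * (1 - expected) / n)


n = 3195
# Digit 2 of the first-digit test: count 595, expected log10(1 + 1/2).
z_digit_2 = z_statistic(595, n, math.log10(1 + 1 / 2))
# Digits 35 of the first-two test: count 24, expected log10(1 + 1/35).
z_digits_35 = z_statistic(24, n, math.log10(1 + 1 / 35))
print(f"{z_digit_2:.2f} {z_digits_35:.2f}")
```

Both values should agree with the Z-Score column of the sample reports (1.48 and 2.35); a bin is flagged when its Z-statistic exceeds the critical value for the chosen alpha, 1.96 at alpha = 0.05.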

MAD Conformity Thresholds

MAD is the preferred conformity measure because chi-square and KS become overly sensitive on large datasets (N > 25,000) and reject data that conforms almost perfectly. The MAD formula does not involve N, so its thresholds apply at any sample size.

Test	Close	Acceptable	Marginal	Nonconformity
First digit	< 0.006	< 0.012	< 0.015	>= 0.015
Second digit	< 0.008	< 0.010	< 0.012	>= 0.012
First two digits	< 0.0012	< 0.0018	< 0.0022	>= 0.0022
First three digits	< 0.00036	< 0.00044	< 0.00050	>= 0.00050
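
The table reads as a cascade: a MAD value falls into the first column whose bound it is below. A minimal sketch of that lookup follows. The dictionary keys, the function name `classify_mad`, and the label strings other than "close" are assumptions for illustration, not pybenford's identifiers.

```python
# Upper bounds for (close, acceptable, marginal), copied from the table above.
MAD_THRESHOLDS = {
    "first_digit": (0.006, 0.012, 0.015),
    "second_digit": (0.008, 0.010, 0.012),
    "first_two_digits": (0.0012, 0.0018, 0.0022),
    "first_three_digits": (0.00036, 0.00044, 0.00050),
}
LABELS = ("close", "acceptable", "marginal")


def classify_mad(mad, test):
    """Return the conformity label for a MAD value on the given test."""
    for label, upper in zip(LABELS, MAD_THRESHOLDS[test]):
        if mad < upper:
            return label
    return "nonconformity"


print(classify_mad(0.0034, "first_digit"))       # MAD from the first-digit report
print(classify_mad(0.0015, "first_two_digits"))  # MAD from the first-two report
```

The two calls use the MAD values from the Quick Start reports and land in the same classes shown there, close and acceptable respectively.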

References

Nigrini, M.J. (2012). Benford's Law: Applications for Forensic Accounting, Auditing, and Fraud Detection. Wiley.
Miller, S.J. (2015). Benford's Law: Theory and Applications. Princeton University Press.
Kossovsky, A.E. (2014). Benford's Law: Theory, the General Law of Relative Quantities, and Forensic Fraud Detection Applications. World Scientific.

License

MIT License. See LICENSE for details.

Release history

0.1.2 (this version): May 7, 2026
0.1.1: May 5, 2026
0.1.0: Apr 23, 2026

