SamFair: Algorithmic bias audit and Post-Prediction Neural Linking library
Project description
SamFair
SamFair is an algorithmic bias audit and explainability library designed to meet the strict regulatory requirements of the upcoming DPDP Act and India AI Bill 2025.
It provides an end-to-end suite for auditing AI recruitment tools (AEDTs) and extracting readable logic from opaque AI models using our proprietary Post-Prediction Neural Linker (PPNL) algorithm.
🚀 Features
- Synthetic Golden Data Generation: Creates robust, culturally-aware synthetic candidate profiles containing intersectional protected attributes (e.g., Gender, Religion, Caste proxy).
- Adverse Impact Engine: Computes the 4/5ths Rule (80% Rule) across intersectional slices to detect hidden bias.
- Post-Prediction Neural Linker (PPNL): Mines surrogate decision tree rules that lead to rejection for flagged groups. Translates opaque predictions into actionable, human-readable logic (
IF feature X THEN reject). - Immutable Evidence Logging: Every audit and rule extraction is hashed (SHA-256) into a tamper-proof JSONL trail.
- Automated DPIA Reports: Generates ready-to-share PDF Data Protection Impact Assessments outlining the found biases and remediation steps.
📦 Installation
pip install samfair
💻 Quick Start
1. Generating Golden Sets
Generate a synthetic dataset of Indian candidate profiles with embedded protected attributes.
from samfair_lib.synthetic_data import generate_golden_set
df = generate_golden_set(n=1000)
print(df.head())
2. Bias Auditing (4/5ths Rule)
Pass your model's predictions into the audit engine to calculate adverse impact ratios.
from samfair_lib.audit import compute_adverse_impact
# df is your DataFrame containing protected columns
# predictions is a list/array of model outputs (1 for accept, 0 for reject)
protected_cols = ['gender', 'religion', 'caste_indicator']
results = compute_adverse_impact(df, predictions, protected_cols)
flagged_groups = results[results['flagged'] == True]
print(flagged_groups)
3. Explainability with PPNL
If biases are found, use PPNL to extract the logic leading to rejections for flagged intersectional groups.
from samfair_lib.ppnl import ppnl_explain
ppnl_output = ppnl_explain(df, predictions, flagged_groups)
print(f"Extracted Rule: {ppnl_output['rule']}")
print(f"Feature Contributions: {ppnl_output['feature_contributions']}")
4. Logging & Reports
Create tamper-proof audit trails and PDF reports.
from samfair_lib.evidence import log_evidence
from samfair_lib.reports import build_report
evidence_hash = log_evidence("AUDIT_RUN", {"flagged_count": len(flagged_groups)})
report_path = build_report(results, ppnl_output, evidence_hash)
🏗️ Architecture
SamFair operates entirely locally without sending data to third-party APIs. It uses lightweight Scikit-Learn surrogates for high-fidelity rule mining, ensuring you have total control over the audit workflow.
📄 License
MIT License.
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
Filter files by name, interpreter, ABI, and platform.
If you're not sure about the file name format, learn more about wheel file names.
Copy a direct link to the current filters
File details
Details for the file samfair-0.1.2.tar.gz.
File metadata
- Download URL: samfair-0.1.2.tar.gz
- Upload date:
- Size: 7.1 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.13.13
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
dcc1995faa77d7d86518f9fdefeb14d3a6f85955d1c7ed5c5ae8910bd08d9b4b
|
|
| MD5 |
864e619842464a521611ff17223a4bbf
|
|
| BLAKE2b-256 |
249837296c536e0bfe5ff60d11ccf330ced1a905ff22c8274e750ff944bc0aad
|
File details
Details for the file samfair-0.1.2-py3-none-any.whl.
File metadata
- Download URL: samfair-0.1.2-py3-none-any.whl
- Upload date:
- Size: 8.4 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.13.13
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
3aec1dfa1ca99cf49aa8682e5bfc0af4585d8adacd2827fd2fb8c6e34e082bd7
|
|
| MD5 |
4741d65a1525b933981b6093df91712b
|
|
| BLAKE2b-256 |
2159974d49981b9dedd71f1cca27dccf40f2d82f809318369903fbd1bff4e700
|