SamFair: Algorithmic bias audit and Post-Prediction Neural Linking library

SamFair

SamFair is an algorithmic bias audit and explainability library designed around the regulatory requirements of India's Digital Personal Data Protection (DPDP) Act, 2023 and the proposed India AI Bill 2025.

It provides an end-to-end suite for auditing AI recruitment tools, also known as automated employment decision tools (AEDTs), and for extracting readable logic from opaque AI models using our proprietary Post-Prediction Neural Linker (PPNL) algorithm.

🚀 Features

  • Synthetic Golden Data Generation: Creates culturally aware synthetic candidate profiles containing intersectional protected attributes (e.g., gender, religion, caste proxy).
  • Adverse Impact Engine: Computes adverse impact ratios across intersectional slices and applies the 4/5ths Rule (80% Rule), flagging any group whose selection rate falls below 80% of the highest group's rate, to detect hidden bias.
  • Post-Prediction Neural Linker (PPNL): Mines surrogate decision tree rules that lead to rejection for flagged groups, translating opaque predictions into actionable, human-readable logic (IF feature X THEN reject).
  • Immutable Evidence Logging: Every audit and rule extraction is hashed (SHA-256) into a tamper-evident JSONL trail.
  • Automated DPIA Reports: Generates ready-to-share PDF Data Protection Impact Assessments outlining the found biases and remediation steps.

📦 Installation

pip install samfair

💻 Quick Start

1. Generating Golden Sets

Generate a synthetic dataset of Indian candidate profiles with embedded protected attributes.

from samfair_lib.synthetic_data import generate_golden_set

df = generate_golden_set(n=1000)
print(df.head())
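
To make the idea of a golden set concrete, here is a minimal standard-library sketch. It is not SamFair's generator: the function name `make_golden_rows`, the column values, and the placeholder category labels are all assumptions made for illustration. It also assumes that protected attributes are sampled independently of qualifications, so a fair model should accept each group at a similar rate.

```python
import random

# Placeholder category labels (assumptions; the real library's values may differ).
GENDERS = ["female", "male", "nonbinary"]
RELIGIONS = ["religion_a", "religion_b", "religion_c"]


def make_golden_rows(n, seed=0):
    """Build n synthetic candidate rows with a seeded RNG for reproducibility."""
    rng = random.Random(seed)
    rows = []
    for candidate_id in range(n):
        rows.append({
            "candidate_id": candidate_id,
            # Protected attributes, drawn independently of the qualification below.
            "gender": rng.choice(GENDERS),
            "religion": rng.choice(RELIGIONS),
            "caste_indicator": rng.randint(0, 1),
            # Qualification feature (hypothetical).
            "years_experience": rng.randint(0, 15),
        })
    return rows


rows = make_golden_rows(5, seed=42)
for row in rows:
    print(row)
```

Fixing the seed makes the audit repeatable: the same golden set can be replayed against a model before and after remediation.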

2. Bias Auditing (4/5ths Rule)

Pass your model's predictions into the audit engine to calculate adverse impact ratios.

from samfair_lib.audit import compute_adverse_impact

# df is your DataFrame containing protected columns
# predictions is a list/array of model outputs (1 for accept, 0 for reject)
protected_cols = ['gender', 'religion', 'caste_indicator']

results = compute_adverse_impact(df, predictions, protected_cols)
flagged_groups = results[results['flagged'] == True]
print(flagged_groups)
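
For readers who want to see the arithmetic behind the 4/5ths Rule, the following is a small standard-library sketch. It is not the implementation of `compute_adverse_impact`: the function name `adverse_impact`, the return shape, and the toy data are assumptions. Each intersectional group's selection rate is divided by the highest group's rate, and ratios below 0.8 are flagged.

```python
from collections import defaultdict


def adverse_impact(rows, predictions, protected_cols, threshold=0.8):
    """Return {group: (selection_rate, impact_ratio, flagged)} per intersectional slice."""
    counts = defaultdict(lambda: [0, 0])  # group -> [accepted, total]
    for row, pred in zip(rows, predictions):
        group = tuple(row[col] for col in protected_cols)
        counts[group][0] += int(pred)
        counts[group][1] += 1

    rates = {group: accepted / total for group, (accepted, total) in counts.items()}
    best_rate = max(rates.values())

    result = {}
    for group, rate in rates.items():
        if best_rate == 0:
            # Nobody was accepted, so the ratio is undefined and nothing is flagged.
            result[group] = (rate, None, False)
        else:
            ratio = rate / best_rate
            result[group] = (rate, ratio, ratio < threshold)
    return result


# Toy data: (gender, religion, accepted, total) per intersectional group.
rows, predictions = [], []
for gender, religion, accepted, total in [("M", "A", 6, 10), ("F", "A", 5, 10), ("F", "B", 3, 10)]:
    for i in range(total):
        rows.append({"gender": gender, "religion": religion})
        predictions.append(1 if i < accepted else 0)

result = adverse_impact(rows, predictions, ["gender", "religion"])
for group, (rate, ratio, flagged) in sorted(result.items()):
    print(group, f"rate={rate:.2f}", f"ratio={ratio:.2f}", "FLAGGED" if flagged else "ok")
```

In this toy data the (F, A) group passes with a ratio of about 0.83, while (F, B) is flagged at 0.50, even though a gender-only view would blend the two groups together.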

3. Explainability with PPNL

If biases are found, use PPNL to extract the logic leading to rejections for flagged intersectional groups.

from samfair_lib.ppnl import ppnl_explain

ppnl_output = ppnl_explain(df, predictions, flagged_groups)
print(f"Extracted Rule: {ppnl_output['rule']}")
print(f"Feature Contributions: {ppnl_output['feature_contributions']}")
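
The PPNL algorithm itself is proprietary, so the sketch below only illustrates the general surrogate-tree technique using scikit-learn: fit a shallow decision tree to the opaque model's outputs for one flagged group, then print the tree as rules. The feature names and toy data are hypothetical, and nothing here should be read as the internals of `ppnl_explain`.

```python
from sklearn.tree import DecisionTreeClassifier, export_text

# Hypothetical features for candidates in a single flagged group.
feature_names = ["years_experience", "career_gap_months"]
X = [[1, 0], [2, 12], [2, 0], [3, 6], [5, 0], [7, 12], [8, 0], [10, 6]]

# Outputs of the opaque model for those candidates (1 = accept, 0 = reject).
black_box_predictions = [0, 0, 0, 1, 1, 1, 1, 1]

# A shallow tree keeps the extracted rules short enough to read.
surrogate = DecisionTreeClassifier(max_depth=2, random_state=0)
surrogate.fit(X, black_box_predictions)

# Fidelity: the share of opaque-model outputs the surrogate reproduces.
fidelity = surrogate.score(X, black_box_predictions)
rules = export_text(surrogate, feature_names=feature_names)

print(f"Surrogate fidelity: {fidelity:.2f}")
print(rules)
```

Fidelity is worth reporting next to any extracted rule: a rule mined from a surrogate that agrees poorly with the original model says little about why that model rejected anyone.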

4. Logging & Reports

Create tamper-evident audit trails and PDF reports.

from samfair_lib.evidence import log_evidence
from samfair_lib.reports import build_report

evidence_hash = log_evidence("AUDIT_RUN", {"flagged_count": len(flagged_groups)})
report_path = build_report(results, ppnl_output, evidence_hash)
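
One common way to build a SHA-256 audit trail is a hash chain, in which every JSONL entry also records the hash of the entry before it. The sketch below shows that pattern with the standard library only. It is an assumption about how such a trail could work, not a description of `log_evidence`, and the helper names `append_entry` and `verify` are hypothetical.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry


def _digest(body):
    """SHA-256 over a canonical (sorted-key) JSON encoding of the entry body."""
    return hashlib.sha256(json.dumps(body, sort_keys=True).encode("utf-8")).hexdigest()


def append_entry(lines, event, payload):
    """Append one JSONL line that is chained to the previous line's hash."""
    prev_hash = json.loads(lines[-1])["hash"] if lines else GENESIS
    body = {"event": event, "payload": payload, "prev_hash": prev_hash}
    digest = _digest(body)
    lines.append(json.dumps({**body, "hash": digest}, sort_keys=True))
    return digest


def verify(lines):
    """Recompute every hash and check that each entry links to its predecessor."""
    prev_hash = GENESIS
    for line in lines:
        entry = json.loads(line)
        claimed = entry.pop("hash")
        if entry["prev_hash"] != prev_hash or _digest(entry) != claimed:
            return False
        prev_hash = claimed
    return True


trail = []  # in-memory stand-in for the lines of a .jsonl file
append_entry(trail, "AUDIT_RUN", {"flagged_count": 2})
append_entry(trail, "RULE_EXTRACTION", {"rule": "years_experience <= 2.5"})
print(verify(trail))

# Editing an earlier entry breaks its hash, so verification fails.
tampered = list(trail)
tampered[0] = tampered[0].replace('"flagged_count": 2', '"flagged_count": 0')
print(verify(tampered))
```

A chain like this detects edits but does not prevent them. Someone who can rewrite the whole file can also recompute every hash, so the latest hash should be stored somewhere the file's owner cannot change, such as the generated report.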

🏗️ Architecture

SamFair runs entirely locally and sends no data to third-party APIs. It uses lightweight scikit-learn surrogate models for high-fidelity rule mining, so the whole audit workflow stays under your control.

📄 License

MIT License.

Download files

Source Distribution

samfair-0.1.2.tar.gz (7.1 kB view details)

Uploaded Source

Built Distribution

samfair-0.1.2-py3-none-any.whl (8.4 kB view details)

Uploaded Python 3

File details

Details for the file samfair-0.1.2.tar.gz.

File metadata

  • Download URL: samfair-0.1.2.tar.gz
  • Upload date:
  • Size: 7.1 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.13.13

File hashes

Hashes for samfair-0.1.2.tar.gz
Algorithm Hash digest
SHA256 dcc1995faa77d7d86518f9fdefeb14d3a6f85955d1c7ed5c5ae8910bd08d9b4b
MD5 864e619842464a521611ff17223a4bbf
BLAKE2b-256 249837296c536e0bfe5ff60d11ccf330ced1a905ff22c8274e750ff944bc0aad

File details

Details for the file samfair-0.1.2-py3-none-any.whl.

File metadata

  • Download URL: samfair-0.1.2-py3-none-any.whl
  • Upload date:
  • Size: 8.4 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.13.13

File hashes

Hashes for samfair-0.1.2-py3-none-any.whl
Algorithm Hash digest
SHA256 3aec1dfa1ca99cf49aa8682e5bfc0af4585d8adacd2827fd2fb8c6e34e082bd7
MD5 4741d65a1525b933981b6093df91712b
BLAKE2b-256 2159974d49981b9dedd71f1cca27dccf40f2d82f809318369903fbd1bff4e700
