fraud-shield

Train, save, and run fraud detection on transaction data. One class. Clean API.

Built from a production Random Forest classifier for credit card fraud detection on imbalanced datasets. It handles the hard parts (class imbalance, balanced-accuracy evaluation, and probability calibration) so you don't have to.


Install

pip install fraud-shield

Or from source:

git clone https://github.com/iamadhitya1/fraud-shield
pip install -e fraud-shield/

Quick Start

from fraudshield import FraudDetector

# Train
detector = FraudDetector()
detector.train("transactions.csv", target_col="Class")
detector.save("fraud_model.pkl")

# Predict single transaction
result = detector.predict({
    "V1": -1.36, "V2": -0.07, "V3": 2.54, "Amount": 149.62
    # ... all feature columns
})

print(result.label)             # "FRAUD" or "LEGITIMATE"
print(result.fraud_probability) # e.g. 0.9423
print(result.confidence)        # e.g. "high"

Train

detector = FraudDetector(
    n_estimators=100,              # number of trees
    random_state=42,               # reproducibility
    high_confidence_threshold=0.80,
    low_confidence_threshold=0.40,
)

metrics = detector.train("creditcard.csv", target_col="Class", verbose=True)
# [fraud-shield] Training on 199364 samples...
# [fraud-shield] Training complete.
#   Balanced Accuracy : 0.9412
#   F1 Score (macro)  : 0.9318
#   ROC-AUC           : 0.9876

Compatible with the Kaggle Credit Card Fraud Detection dataset and any other binary classification dataset with 0/1 labels.


Predict

Single transaction

result = detector.predict(transaction_dict)

result.is_fraud           # True / False
result.fraud_probability  # 0.0 – 1.0
result.confidence         # "high" / "medium" / "low"
result.label              # "FRAUD" / "LEGITIMATE"
result.to_dict()          # { is_fraud, fraud_probability, confidence, label }
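
The README does not spell out how `confidence` is derived from `high_confidence_threshold` and `low_confidence_threshold`. A minimal sketch of one plausible reading follows, assuming the fraud probability is compared against the two thresholds directly. The function `confidence_bucket` is hypothetical and not part of fraud-shield, so check the package source for the actual rule.

```python
def confidence_bucket(fraud_probability: float,
                      high: float = 0.80,
                      low: float = 0.40) -> str:
    """Hypothetical mapping from a fraud probability to a confidence label.

    Assumption: probabilities at or above `high` are "high", those at or
    above `low` are "medium", and everything else is "low".
    """
    if not 0.0 <= fraud_probability <= 1.0:
        raise ValueError("fraud_probability must be in [0, 1]")
    if fraud_probability >= high:
        return "high"
    if fraud_probability >= low:
        return "medium"
    return "low"


for p in (0.9423, 0.55, 0.10):
    print(p, confidence_bucket(p))
```

Under this assumption a probability of 0.9423 lands in the "high" bucket, which agrees with the Quick Start output.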

Batch prediction

import pandas as pd

df = pd.read_csv("new_transactions.csv")
results_df = detector.predict_batch(df)

# Adds columns: fraud_probability, is_fraud, confidence, label
print(results_df[["Amount", "fraud_probability", "label"]].head())

Evaluate

metrics = detector.evaluate("test_data.csv", target_col="Class")

# Returns dict with:
# balanced_accuracy, precision_macro, recall_macro,
# f1_macro, roc_auc, confusion_matrix, classification_report
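
To make the macro-averaged metrics concrete, here is a minimal sketch that derives them from a binary confusion matrix in plain Python. The helper `macro_metrics` and the counts passed to it are made up for illustration. They are not fraud-shield code or results.

```python
def macro_metrics(tn: int, fp: int, fn: int, tp: int) -> dict:
    """Macro-averaged metrics from a binary confusion matrix."""
    # Per-class precision and recall, treating each class in turn as "positive".
    prec_fraud = tp / (tp + fp) if tp + fp else 0.0
    rec_fraud = tp / (tp + fn) if tp + fn else 0.0
    prec_legit = tn / (tn + fn) if tn + fn else 0.0
    rec_legit = tn / (tn + fp) if tn + fp else 0.0

    def f1(p: float, r: float) -> float:
        return 2 * p * r / (p + r) if p + r else 0.0

    recall_macro = (rec_fraud + rec_legit) / 2
    return {
        "precision_macro": (prec_fraud + prec_legit) / 2,
        "recall_macro": recall_macro,
        # Balanced accuracy is the mean of per-class recall, so it
        # equals macro recall.
        "balanced_accuracy": recall_macro,
        "f1_macro": (f1(prec_fraud, rec_fraud) + f1(prec_legit, rec_legit)) / 2,
    }


# Made-up counts: 100 legitimate and 10 fraudulent transactions.
metrics = macro_metrics(tn=90, fp=10, fn=2, tp=8)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Macro averaging gives the fraud class the same weight as the legitimate class, no matter how few fraud cases there are.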

Feature Importances

top = detector.feature_importances(top_n=10)
print(top)
# V14    0.1821
# V17    0.1342
# V12    0.1089
# ...

Save & Load

# Save
detector.save("fraud_model.pkl")

# Load in another script
detector = FraudDetector.load("fraud_model.pkl")
result = detector.predict(transaction)

Why balanced accuracy?

Raw accuracy is misleading on fraud data: a model that predicts every transaction as legitimate achieves ~99.8% accuracy on a dataset with ~0.2% fraud while catching zero fraud. fraud-shield uses balanced accuracy by default, which averages recall across both classes and so penalizes models that ignore the minority class.
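
The arithmetic behind that claim can be checked with a short sketch. It uses the class counts of the Kaggle dataset (284,807 transactions, 492 fraud cases) and a hypothetical "model" that labels everything as legitimate. No fraud-shield code is involved.

```python
n_total = 284_807
n_fraud = 492
n_legit = n_total - n_fraud

# Confusion counts for a "model" that labels every transaction as legitimate.
tp = 0          # fraud correctly flagged
fn = n_fraud    # fraud missed
tn = n_legit    # legitimate correctly passed
fp = 0          # legitimate wrongly flagged

accuracy = (tp + tn) / n_total
recall_fraud = tp / (tp + fn)
recall_legit = tn / (tn + fp)
balanced_accuracy = (recall_fraud + recall_legit) / 2

print(f"accuracy          = {accuracy:.4f}")           # 0.9983
print(f"balanced accuracy = {balanced_accuracy:.4f}")  # 0.5000
```

A balanced accuracy of 0.5 is what random guessing would score, so the metric exposes the useless model that raw accuracy hides.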


Dataset

The included example targets the Kaggle Credit Card Fraud Detection dataset:

  • 284,807 transactions
  • 492 fraud cases (0.17%)
  • Features: V1–V28 (PCA-anonymized), Amount, Time
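
To give a feel for how skewed these counts are, the sketch below computes the fraud rate and the per-class weights that scikit-learn's `class_weight="balanced"` heuristic would assign, `n_samples / (n_classes * class_count)`. Whether fraud-shield uses this particular weighting is an assumption. The README only says that class imbalance is handled.

```python
# Class counts from the dataset description: 0 = legitimate, 1 = fraud.
counts = {0: 284_807 - 492, 1: 492}

n_samples = sum(counts.values())
n_classes = len(counts)

fraud_rate = counts[1] / n_samples

# "balanced" heuristic: n_samples / (n_classes * class_count)
weights = {label: n_samples / (n_classes * c) for label, c in counts.items()}

print(f"fraud rate: {fraud_rate:.2%}")  # 0.17%
for label, weight in weights.items():
    print(f"class {label}: weight {weight:.3f}")
```

With these counts a single fraud case weighs several hundred times as much as a legitimate one, which is the scale of correction an imbalance-aware classifier has to apply.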

License

MIT © 2025 M Adhitya

Built at Rewrite Labs — extracted from production ML research at IITRAM Ahmedabad.
