A Fully Automated Risk & Trading Intelligence Engine

These details have not been verified by PyPI

Project links

Project description

🚀 AutoRiskML - The First Fully Automated Risk & Trading Intelligence Engine

The only Python package that acts like a Senior Risk Data Scientist

AutoRiskML automates the entire risk modeling pipeline from data ingestion to Azure deployment. Built for banks, fintechs, trading firms, and hedge funds.

🎯 Why AutoRiskML is Revolutionary

❌ The Problem

Risk data scientists spend 80% of their time on:

Manual data cleaning and binning
Computing WOE/IV tables
Monitoring PSI and drift
Building scorecards
Setting up model monitoring
Creating deployment pipelines

✅ The Solution: AutoRiskML

from autoriskml import AutoRisk

# ONE command does EVERYTHING a senior risk DS would do:
ar = AutoRisk(project="loan_scoring")
ar.register_source("train", csv="data/loans.csv")
result = ar.run(
    source="train",
    target="default_flag",
    explain=True,
    deploy={"provider": "azure_ml"}
)

# You now have:
# ✅ Data profile & recommendations
# ✅ Automated cleaning
# ✅ Optimal binning & WOE/IV tables
# ✅ Trained scorecard model
# ✅ PSI & drift monitoring
# ✅ SHAP explainability
# ✅ Production-ready Azure deployment
# ✅ PDF/HTML reports

🏆 Unique Features (No Other Package Has These)

Feature	Pandas	Scikit-learn	H2O	PyCaret	AutoRiskML
Auto WOE/IV	❌	❌	❌	❌	✅
Auto PSI	❌	❌	❌	❌	✅
Scorecard Generation	❌	❌	❌	❌	✅
Drift Detection for Trading	❌	❌	❌	❌	✅
Risk-specific Binning	❌	❌	Partial	❌	✅
Azure ML Auto-deploy	❌	❌	❌	❌	✅
Built-in Monitoring	❌	❌	❌	❌	✅
Audit Trail	❌	❌	❌	❌	✅
Pure Python	✅	✅	❌	✅	✅

🔥 What AutoRiskML Does

A. Automated Risk ML Pipeline

# 1. DATA PROFILING - Like a senior DS would analyze
ar.profile()
# → Column types, missing %, distributions, recommendations

# 2. AUTO-CLEANING - Handles all edge cases
ar.autoclean()
# → Missing values, outliers, type coercion, date parsing

# 3. FEATURE ENGINEERING - Risk-specific features
ar.auto_features()
# → Binning, WOE encoding, interaction features

# 4. MODEL TRAINING - Multiple algorithms
ar.train(models=["logistic", "xgboost", "lightgbm"])
# → Auto hyperparameter tuning, walk-forward validation

# 5. SCORECARD GENERATION - Convert to points
ar.scorecard(pdo=20, base_score=600)
# → Industry-standard credit scoring

B. Risk Scoring Engine (WOE/IV/PSI)

# Weight of Evidence & Information Value
woe_iv = ar.compute_woe_iv(feature="credit_utilization", target="default")
print(f"IV: {woe_iv['iv']:.3f}")  # Predictive power
print(woe_iv['woe_table'])         # Bin-level WOE

# Population Stability Index
psi = ar.compute_psi(
    baseline_data="train.csv",
    current_data="production_data.csv"
)
print(f"PSI: {psi:.3f}")  # <0.1: stable, >0.25: significant drift

# Characteristic Stability Index
csi = ar.compute_csi(feature="income", current_data="latest.csv")

C. Monitoring & Drift Detection

# Continuous monitoring
monitor = ar.monitor(
    production_data="s3://bucket/prod_scores.parquet",
    baseline="train.csv",
    alert_threshold=0.2
)

print(monitor.summary())
# → PSI per feature
# → Score distribution shift
# → Prediction drift
# → Retrain recommendations

D. Explainability (SHAP + Custom)

# Global explainability
ar.explain_global()
# → Top features driving risk
# → SHAP summary plots

# Local explainability (per-record)
explanation = ar.explain_record(customer_id=12345)
print(explanation.reason_codes)
# → "High credit utilization (+45 pts)"
# → "Recent late payments (+30 pts)"

E. Deployment to Azure

# One-command deployment
endpoint = ar.deploy(
    provider="azure_ml",
    workspace="RiskWS",
    resource_group="risk-rg",
    compute_type="aks",  # or "aci" for quick tests
    instance_count=3
)

print(f"Endpoint: {endpoint.scoring_uri}")
print(f"Key: {endpoint.primary_key}")

# Score new data via REST API
scores = endpoint.score(new_customers_df)

F. Backtesting (Trading Mode)

# Time-series walk-forward validation
backtest = ar.backtest(
    data="trading_signals.csv",
    strategy="long_short",
    walk_forward_windows=12,
    refit_frequency="monthly"
)

print(backtest.sharpe_ratio)
print(backtest.max_drawdown)
print(backtest.cumulative_returns)

G. Auto-Reporting

# Generate comprehensive reports
ar.report(
    output="risk_report.html",
    include=[
        "data_profile",
        "woe_iv_tables",
        "model_performance",
        "psi_monitoring",
        "shap_explanations",
        "scorecard",
        "recommendations"
    ]
)

# PDF for regulators
ar.report(output="regulatory_report.pdf", template="basel")

📦 Installation

Basic (Pure Python, zero dependencies)

pip install autoriskml

With Machine Learning

pip install autoriskml[ml]

With Explainability

pip install autoriskml[explain]

With Azure Deployment

pip install autoriskml[azure]

Full Installation (Everything)

pip install autoriskml[all]

🚀 Quick Start (30 Seconds)

Example 1: Credit Scoring

from autoriskml import AutoRisk

# Initialize
ar = AutoRisk(project="credit_scoring")

# Register data
ar.register_source("train", csv="loans_train.csv")
ar.register_source("test", csv="loans_test.csv")

# Run full pipeline
result = ar.run(
    source="train",
    validation_source="test",
    target="default_flag",
    config="configs/credit_config.yaml"
)

# Access artifacts
print(f"Model AUC: {result.metrics['auc']:.3f}")
print(f"Model PSI: {result.metrics['psi']:.3f}")
print(f"Scorecard: {result.scorecard_path}")
print(f"Report: {result.report_html}")

Example 2: Fraud Detection

ar = AutoRisk(project="fraud_detection")
ar.register_source("transactions", sql_query="""
    SELECT * FROM transactions 
    WHERE date >= '2024-01-01'
""", connection_string="postgresql://...")

result = ar.run(
    source="transactions",
    target="is_fraud",
    models=["logistic", "xgboost"],
    explain=True,
    monitor={"psi_threshold": 0.15}
)

Example 3: Trading Risk

ar = AutoRisk(project="trading_risk", mode="trading")
ar.register_source("signals", parquet="s3://bucket/signals.parquet")

result = ar.run(
    source="signals",
    target="return_next_day",
    backtest=True,
    walk_forward=True,
    deploy={"provider": "azure_ml"}
)

print(f"Sharpe Ratio: {result.backtest['sharpe']:.2f}")
print(f"Max Drawdown: {result.backtest['max_dd']:.2%}")

🎓 Complete Example: End-to-End Loan Scoring

from autoriskml import AutoRisk
import pandas as pd

# 1. Initialize project
ar = AutoRisk(
    project="personal_loans",
    output_dir="artifacts/loans",
    log_level="INFO"
)

# 2. Register data sources
ar.register_source("train", csv="data/loans_2022_2023.csv")
ar.register_source("valid", csv="data/loans_2024_Q1.csv")
ar.register_source("prod", s3="s3://bucket/prod/loans.parquet")

# 3. Profile data (optional but recommended)
profile = ar.profile(source="train")
print(profile.summary())
# → 50,000 rows × 45 features
# → Missing: income (5%), employment_length (12%)
# → Recommendations: 8 features to drop, 3 to engineer

# 4. Run full automated pipeline
result = ar.run(
    source="train",
    validation_source="valid",
    target="default_flag",
    
    # Cleaning options
    clean={
        "missing_strategy": "auto",  # smart imputation
        "outlier_method": "iqr",
        "date_formats": ["%Y-%m-%d", "%d/%m/%Y"]
    },
    
    # Binning options
    binning={
        "numeric_method": "monotonic",  # monotonic bad rate
        "max_bins": 6,
        "min_bin_size": 0.05
    },
    
    # Feature selection
    features={
        "min_iv": 0.02,  # minimum information value
        "max_features": 20,
        "auto_interactions": True
    },
    
    # Model options
    models=[
        {"type": "logistic", "penalty": 0.1},
        {"type": "xgboost", "params": {"max_depth": 6, "eta": 0.05}}
    ],
    
    # Scorecard conversion
    scorecard={
        "pdo": 20,        # points to double odds
        "base_score": 600,
        "base_odds": 50
    },
    
    # Explainability
    explain=True,
    
    # Monitoring
    monitor={
        "compute_psi": True,
        "psi_threshold": 0.2,
        "drift_features": "auto",
        "retrain_trigger": "drift_or_performance"
    },
    
    # Reporting
    report={
        "formats": ["html", "pdf"],
        "template": "executive"
    },
    
    # Deployment
    deploy={
        "provider": "azure_ml",
        "workspace": "RiskWS",
        "resource_group": "risk-prod-rg",
        "compute": "aks-cluster",
        "auth": "key"
    }
)

# 5. Access results
print("\n" + "="*70)
print("📊 RESULTS")
print("="*70)
print(f"✅ Model: {result.best_model}")
print(f"✅ AUC: {result.metrics['auc']:.3f}")
print(f"✅ KS: {result.metrics['ks']:.3f}")
print(f"✅ Gini: {result.metrics['gini']:.3f}")
print(f"✅ PSI (validation): {result.metrics['psi']:.3f}")
print(f"\n📁 Artifacts:")
print(f"   • Model: {result.model_path}")
print(f"   • Scorecard: {result.scorecard_path}")
print(f"   • Binning spec: {result.binning_spec_path}")
print(f"   • WOE tables: {result.woe_tables_path}")
print(f"   • Report: {result.report_html}")
print(f"\n🌐 Deployment:")
print(f"   • Endpoint: {result.endpoint.scoring_uri}")
print(f"   • Key: {result.endpoint.primary_key[:20]}...")

# 6. Score new customers
new_customers = pd.read_csv("data/new_applications.csv")
scores = ar.score(new_customers, output="with_reasons")

print(f"\n✅ Scored {len(scores)} new customers")
print(scores[['customer_id', 'score', 'probability', 'risk_tier', 'top_reason']].head())

# 7. Monitor production data
monitor_result = ar.monitor(source="prod")
if monitor_result.alert:
    print(f"\n⚠️  ALERT: {monitor_result.message}")
    print(f"   PSI: {monitor_result.psi:.3f} (threshold: 0.20)")
    print(f"   Drifted features: {', '.join(monitor_result.drifted_features)}")
    print(f"   Recommendation: {monitor_result.recommendation}")

📚 Advanced Features

Custom Binning Strategy

from autoriskml.binning import CustomBinner

class MyBinner(CustomBinner):
    def fit(self, values, target):
        # Your custom binning logic
        bins = self.compute_custom_bins(values, target)
        return bins

ar.register_binner("my_method", MyBinner())
result = ar.run(..., binning={"method": "my_method"})

Custom Model Adapter

from autoriskml.models import ModelAdapter

class MyModelAdapter(ModelAdapter):
    def train(self, X, y):
        # Train your model
        self.model = YourModel().fit(X, y)
    
    def predict_proba(self, X):
        return self.model.predict_proba(X)

ar.register_model("my_model", MyModelAdapter())

Streaming Scoring

# Score large datasets in chunks
for chunk_scores in ar.score_stream(
    source="s3://bucket/huge_file.csv",
    chunk_size=100_000,
    output="s3://bucket/scores/"
):
    print(f"Scored {len(chunk_scores)} records")

Real-time Monitoring

# Set up continuous monitoring
ar.monitor_continuously(
    source_stream="kafka://topic/transactions",
    baseline="train_data.csv",
    check_interval="hourly",
    alert_email="risk-team@company.com"
)

🏗️ Architecture

┌─────────────────────────────────────────────────────────────────┐
│                        AutoRisk API                              │
│  (Simple high-level interface: run(), score(), monitor())       │
└────────────────────┬────────────────────────────────────────────┘
                     │
┌────────────────────┴────────────────────────────────────────────┐
│                    Core Pipeline Orchestrator                    │
│  • Stage execution  • Artifact management  • Provenance tracking │
└────┬───────┬────────┬─────────┬──────────┬───────────┬──────────┘
     │       │        │         │          │           │
     ▼       ▼        ▼         ▼          ▼           ▼
┌─────────┬──────┬────────┬─────────┬────────────┬──────────────┐
│Connector│Profil│Cleaning│ Binning │  Models    │   Scoring    │
│CSV/SQL/ │-ing  │Auto    │WOE/IV   │Logistic/   │  Scorecard   │
│S3/Kafka │      │Clean   │Monotonic│XGB/LightGBM│  Generation  │
└─────────┴──────┴────────┴─────────┴────────────┴──────────────┘
                                      │
                 ┌────────────────────┼─────────────────────┐
                 ▼                    ▼                     ▼
            ┌─────────┐         ┌──────────┐        ┌────────────┐
            │ Metrics │         │ Explain  │        │ Monitoring │
            │PSI/CSI/ │         │SHAP/LIME │        │Drift/Alert │
            │KS/Gini  │         │Reasons   │        │PSI Tracker │
            └─────────┘         └──────────┘        └────────────┘
                                      │
                 ┌────────────────────┴─────────────────────┐
                 ▼                                          ▼
            ┌─────────┐                               ┌──────────┐
            │ Export  │                               │Deployment│
            │ONNX/    │                               │Azure ML/ │
            │Joblib   │                               │AKS/API   │
            └─────────┘                               └──────────┘

🎯 Use Cases

1. Banks & Credit Unions

Personal loan scoring
Credit card approvals
Mortgage risk assessment
SME lending

2. Fintechs

BNPL (Buy Now Pay Later) scoring
Micro-lending
Alternative credit scoring
KYC risk assessment

3. Insurance

Claims fraud detection
Underwriting risk
Policyholder lifetime value

4. Trading Firms

Strategy risk monitoring
Position sizing
Counterparty risk
Market regime detection

5. E-commerce

Transaction fraud
Account takeover detection
Chargeback prediction

📊 Performance

Speed: 10x faster than manual process
Accuracy: Comparable to senior DS work
Scalability: Handles 100M+ records with distributed mode
Memory: Streaming support for datasets > RAM

🔒 Security & Compliance

✅ Local-first (no external calls by default)
✅ Audit trail for all transformations
✅ PII detection and scrubbing
✅ Explainable AI for regulatory compliance
✅ Reproducible pipelines (version control)
✅ GDPR-compliant data handling

📖 Documentation

🤝 Contributing

Contributions are welcome! See CONTRIBUTING.md

📝 License

MIT License - see LICENSE file

🙏 Acknowledgments

Built with inspiration from years of risk modeling in banking and fintech.

📧 Contact

Author: Idriss Bado
Email: idrissbadoolivier@gmail.com
GitHub: idrissbado

⭐ If AutoRiskML helps you, please star the repo!

🚀 Built for the future of automated risk intelligence

Project details

These details have not been verified by PyPI

Project links

Release history Release notifications | RSS feed

This version

0.1.0

Dec 4, 2025

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

autoriskml-0.1.0.tar.gz (26.1 kB view details)

Uploaded Dec 4, 2025 Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

The dropdown lists show the available interpreters, ABIs, and platforms. Enable javascript to be able to filter the list of wheel files.

autoriskml-0.1.0-py3-none-any.whl (17.5 kB view details)

Uploaded Dec 4, 2025 Python 3

File details

Details for the file autoriskml-0.1.0.tar.gz.

File metadata

Download URL: autoriskml-0.1.0.tar.gz
Upload date: Dec 4, 2025
Size: 26.1 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.12.0

File hashes

Hashes for autoriskml-0.1.0.tar.gz
Algorithm	Hash digest
SHA256	`22a51286dbf7f400c553b8ad5a60c10286aa33f5d3e268069773e63328ed3dba`
MD5	`f69e9d4a0b7c1d0c2181c9bde71a6565`
BLAKE2b-256	`9380a11edac558e8ca4a64f566cb0f840ab2b1889fd520a67582e931337e63b2`

See more details on using hashes here.

File details

Details for the file autoriskml-0.1.0-py3-none-any.whl.

File metadata

Download URL: autoriskml-0.1.0-py3-none-any.whl
Upload date: Dec 4, 2025
Size: 17.5 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.12.0

File hashes

Hashes for autoriskml-0.1.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`94883cbcad693693f3ca5719cfd9eebe556650f1d14bec9f7e68c8bec49bd8d8`
MD5	`99d4b59014b5c76c7c8b9b3e4e4fd1d7`
BLAKE2b-256	`43e25e1f0271332a055eaa989932d9adb72f4d985cdf3cdf80f95a5b874dfc80`

See more details on using hashes here.

autoriskml 0.1.0

Navigation

Verified details

Maintainers

Unverified details

Project links

Meta

Classifiers

Project description

🚀 AutoRiskML - The First Fully Automated Risk & Trading Intelligence Engine

🎯 Why AutoRiskML is Revolutionary

❌ The Problem

✅ The Solution: AutoRiskML

🏆 Unique Features (No Other Package Has These)

🔥 What AutoRiskML Does

A. Automated Risk ML Pipeline

B. Risk Scoring Engine (WOE/IV/PSI)

C. Monitoring & Drift Detection

D. Explainability (SHAP + Custom)

E. Deployment to Azure

F. Backtesting (Trading Mode)

G. Auto-Reporting

📦 Installation

Basic (Pure Python, zero dependencies)

With Machine Learning

With Explainability

With Azure Deployment

Full Installation (Everything)

🚀 Quick Start (30 Seconds)

Example 1: Credit Scoring

Example 2: Fraud Detection

Example 3: Trading Risk

🎓 Complete Example: End-to-End Loan Scoring

📚 Advanced Features

Custom Binning Strategy

Custom Model Adapter

Streaming Scoring

Real-time Monitoring

🏗️ Architecture

🎯 Use Cases

1. Banks & Credit Unions

2. Fintechs

3. Insurance

4. Trading Firms

5. E-commerce

📊 Performance

🔒 Security & Compliance

📖 Documentation

🤝 Contributing

📝 License

🙏 Acknowledgments

📧 Contact

Project details

Verified details

Maintainers

Unverified details

Project links

Meta

Classifiers

Release history Release notifications | RSS feed

Download files

Source Distribution

Built Distribution

File details

File metadata

File hashes

File details

File metadata

File hashes