ScoreFlow

Production-ready automated ML scoring engine.

ScoreFlow is a production-ready automated machine learning scoring engine. It takes raw tabular data, trains and evaluates multiple classifiers, selects the best model by AUC and stability, and exposes it as a REST API or exportable batch pipeline — all from a single Python package.


Features

| Capability | Details |
| --- | --- |
| Multi-model training | Logistic Regression, Random Forest, LightGBM (pluggable interface) |
| Rigorous evaluation | AUC-ROC, KS statistic, lift at deciles, PSI stability |
| Automatic selection | Best model chosen by AUC, with KS as a tie-breaker |
| REST API | FastAPI app with /health and /score endpoints |
| Batch pipeline | Export any trained model to a self-contained folder and run it on CSV/Parquet |
| Structured logging | JSON-line logs with request IDs, model version, and timing |
| Reproducibility | Fixed random seeds, versioned artifacts with timestamps |
| Full test suite | 50 tests: unit, API, pipeline, and end-to-end integration |

Architecture

Raw CSV/Parquet
      │
      ▼
┌─────────────┐    schema, splits
│  Data layer │ ──────────────────────────────────────────────┐
└─────────────┘                                               │
      │                                                       │
      ▼                                                       ▼
┌──────────────┐  ModelArtifact   ┌────────────────┐   ┌───────────────┐
│ Model Trainer│ ────────────────▶│   Evaluator    │──▶│ Model Selector│
└──────────────┘                  └────────────────┘   └───────┬───────┘
                                                               │ best artifact
                                  ┌────────────────┐           │
                                  │  Scoring API   │◀──────────┤
                                  │   (FastAPI)    │           │
                                  └────────────────┘           │
                                                               ▼
                                                     ┌───────────────────┐
                                                     │ Pipeline Exporter │
                                                     │ (offline runner)  │
                                                     └───────────────────┘

Package Layout

ScoreFlow/
├── scoreflow/
│   ├── config.py               # Paths, seeds, global defaults
│   ├── logging_config.py       # Structured JSON logging setup
│   ├── __main__.py             # CLI: `python -m scoreflow`
│   ├── data/
│   │   ├── schema.py           # DatasetSchema — feature/target definitions
│   │   ├── loaders.py          # CSV/Parquet loaders with validation
│   │   └── splitters.py        # Train / val / test split logic
│   ├── models/
│   │   ├── base.py             # Abstract BaseModel + ModelArtifact dataclass
│   │   ├── logistic.py         # Logistic Regression wrapper
│   │   ├── random_forest.py    # Random Forest wrapper
│   │   ├── lightgbm_model.py   # LightGBM wrapper
│   │   ├── trainer.py          # Trainer orchestrator
│   │   └── registry.py         # Hyperparameter registry
│   ├── evaluation/
│   │   ├── metrics.py          # AUC, KS, lift, PSI
│   │   ├── evaluator.py        # Evaluator — runs all metrics, returns report
│   │   └── selection.py        # select_best_model(), persist_evaluation_report()
│   ├── api/
│   │   └── app.py              # FastAPI app factory (create_app)
│   └── pipeline/
│       ├── exporter.py         # export_pipeline() — serialize artifact to folder
│       └── runner.py           # run_pipeline() — load export, score a file
├── tests/
│   ├── test_config.py
│   ├── test_data.py
│   ├── test_models.py
│   ├── test_evaluation.py
│   ├── test_api.py
│   ├── test_pipeline.py
│   └── test_integration.py     # Full end-to-end flow
├── pyproject.toml
├── Makefile
└── ROADMAP.md

Installation

Requirements: Python ≥ 3.11

# 1. Clone the repository
git clone https://github.com/your-org/scoreflow.git
cd scoreflow

# 2. Install with development dependencies
pip install -e ".[dev]"

Core dependencies installed automatically: pandas, scikit-learn, numpy, lightgbm, xgboost, pyarrow, scipy, fastapi, uvicorn, joblib.


Quickstart

1 — Train models on your data

import pandas as pd
from scoreflow.data.schema import DatasetSchema
from scoreflow.data.splitters import random_split
from scoreflow.models.trainer import Trainer

df = pd.read_csv("data/my_dataset.csv")

schema = DatasetSchema(target="default")   # name of your binary target column
schema.resolve(df)                         # auto-detect feature columns

splits = random_split(df, target="default", test_size=0.2, val_size=0.1)

trainer = Trainer(schema)                  # trains LR, RF, LightGBM by default
artifacts = trainer.train(splits)
trainer.save_artifacts(artifacts, "models/")

2 — Evaluate and select the best model

from scoreflow.evaluation import Evaluator, select_best_model, persist_evaluation_report

evaluator = Evaluator(schema)
reports = [evaluator.evaluate(a, splits) for a in artifacts]

best = select_best_model(reports, split="val")
print(f"Best model: {best.model_name}  AUC={best.metrics['val']['auc']:.4f}")

persist_evaluation_report(best, "reports/")

3 — Start the scoring API

python -m scoreflow serve --model-dir models/
# Health check
curl http://localhost:8000/health

# Score a record
curl -X POST http://localhost:8000/score \
  -H "Content-Type: application/json" \
  -d '{"records": [{"feature_a": 1.2, "feature_b": 0.5}]}'

4 — Export and run as a batch pipeline

# Export a trained artifact to a portable folder (Python, continuing from
# steps 1–2: export_pipeline takes a trained artifact plus the training schema)
from scoreflow.pipeline.exporter import export_pipeline

pipeline_dir = export_pipeline(artifacts[0], schema, output_dir="exports/")

# Or use the CLI runner on an exported pipeline
python -m scoreflow run \
  --pipeline exports/logistic_regression \
  --input new_data.csv \
  --output scores.csv

Modules In Depth

Data Ingestion

scoreflow/data/schema.py: DatasetSchema

Holds the target column name, optional ID and date columns, and the resolved list of numeric feature columns. Call schema.resolve(df) once to auto-detect features.

schema = DatasetSchema(
    target="default",
    id_col="customer_id",        # optional, excluded from features
    date_col="application_date"  # optional, excluded from features
)
schema.resolve(df)
print(schema.features)  # ['age', 'income', 'debt_ratio', ...]

scoreflow/data/loaders.py

from scoreflow.data.loaders import load_dataset

df = load_dataset("data/train.csv")          # CSV
df = load_dataset("data/train.parquet")      # Parquet

Validates: missing rates, data types, required columns.
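The validation step could be sketched as below. The function name, the missing-rate threshold, and the exact checks are illustrative assumptions, not the package's actual implementation:

```python
import pandas as pd

def validate_dataset(df: pd.DataFrame, required: list[str],
                     max_missing_rate: float = 0.5) -> None:
    """Illustrative checks: required columns present, missing rates bounded."""
    missing_cols = [c for c in required if c not in df.columns]
    if missing_cols:
        raise ValueError(f"missing required columns: {missing_cols}")
    rates = df.isna().mean()
    too_sparse = rates[rates > max_missing_rate]
    if not too_sparse.empty:
        raise ValueError(f"columns exceed missing-rate threshold: {list(too_sparse.index)}")
```

A loader would run checks like these right after reading the file, so schema problems surface before any training starts.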

scoreflow/data/splitters.py

from scoreflow.data.splitters import random_split, time_split

splits = random_split(df, target="default", test_size=0.2, val_size=0.1, seed=42)
# splits.train / splits.val / splits.test
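A three-way split of this shape can be sketched as follows; the Splits container name matches the attributes above, but the implementation (a single shuffle, no stratification on the target) is an illustrative simplification:

```python
from dataclasses import dataclass

import numpy as np
import pandas as pd

@dataclass
class Splits:
    train: pd.DataFrame
    val: pd.DataFrame
    test: pd.DataFrame

def random_split_sketch(df: pd.DataFrame, test_size: float = 0.2,
                        val_size: float = 0.1, seed: int = 42) -> Splits:
    """Shuffle once, carve off the test and val fractions; the rest is train."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(df))
    n_test = int(len(df) * test_size)
    n_val = int(len(df) * val_size)
    return Splits(
        train=df.iloc[idx[n_test + n_val:]],
        val=df.iloc[idx[n_test:n_test + n_val]],
        test=df.iloc[idx[:n_test]],
    )
```

With test_size=0.2 and val_size=0.1 this yields a 70/10/20 train/val/test split, and the fixed seed makes the split reproducible across runs.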

Model Training

All models implement a common BaseModel interface: fit(X, y), predict_proba(X), get_params().
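The interface can be pictured as an abstract base class like this (method names from above; the toy subclass and the choice to return P(y=1) as a 1-D array are illustrative assumptions):

```python
from abc import ABC, abstractmethod

import numpy as np

class BaseModel(ABC):
    """Minimal sketch of the common classifier interface."""

    @abstractmethod
    def fit(self, X: np.ndarray, y: np.ndarray) -> "BaseModel": ...

    @abstractmethod
    def predict_proba(self, X: np.ndarray) -> np.ndarray:
        """Return P(y=1) for each row of X."""

    @abstractmethod
    def get_params(self) -> dict: ...

class MajorityClassModel(BaseModel):
    """Toy implementation: always predicts the training positive rate."""

    def fit(self, X, y):
        self.rate_ = float(np.mean(y))
        return self

    def predict_proba(self, X):
        return np.full(len(X), self.rate_)

    def get_params(self):
        return {}
```

Anything satisfying this contract can be registered as an additional model, which is what "pluggable interface" means in practice.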

| Model | Class | Notes |
| --- | --- | --- |
| Logistic Regression | LogisticModel | L2-regularised, fast baseline |
| Random Forest | RandomForestModel | 200 trees, bagging ensemble |
| LightGBM | LightGBMModel | Gradient boosting, high accuracy |

Trainer orchestrator

from scoreflow.models.trainer import Trainer

# Train specific models only
trainer = Trainer(schema, model_names=["logistic_regression", "lightgbm"])
artifacts = trainer.train(splits)          # returns List[ModelArtifact]
trainer.save_artifacts(artifacts, "models/")

Each artifact is saved as a sub-directory:

models/
├── logistic_regression/
│   ├── model.joblib       # serialised model
│   └── metadata.json      # name, timestamp, params
├── random_forest/
│   └── ...
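
The save/load round trip behind that layout could look like the sketch below. The class and field names are illustrative, and pickle stands in for joblib (which the real package uses for model.joblib) to keep the example stdlib-only:

```python
import json
import pickle
from dataclasses import dataclass
from datetime import datetime, timezone
from pathlib import Path

@dataclass
class ArtifactSketch:
    name: str
    model: object
    params: dict

    def save(self, root: str) -> Path:
        out = Path(root) / self.name
        out.mkdir(parents=True, exist_ok=True)
        # Serialised model (the real package writes model.joblib via joblib)
        (out / "model.pkl").write_bytes(pickle.dumps(self.model))
        meta = {"name": self.name, "params": self.params,
                "timestamp": datetime.now(timezone.utc).isoformat()}
        (out / "metadata.json").write_text(json.dumps(meta, indent=2))
        return out

    @classmethod
    def load(cls, path: str) -> "ArtifactSketch":
        p = Path(path)
        meta = json.loads((p / "metadata.json").read_text())
        model = pickle.loads((p / "model.pkl").read_bytes())
        return cls(name=meta["name"], model=model, params=meta["params"])
```

Keeping the timestamp in metadata.json is what makes "versioned artifacts with timestamps" cheap: the newest artifact can be identified without deserialising any model.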

Evaluation & Selection

Metrics computed (on val and test splits):

| Metric | Description |
| --- | --- |
| auc | Area Under the ROC Curve |
| ks | Kolmogorov–Smirnov statistic |
| lift_10 | Lift at top 10% of scores |
| lift_20 | Lift at top 20% of scores |
| psi | Population Stability Index (train vs val) |

from scoreflow.evaluation import Evaluator, select_best_model

evaluator = Evaluator(schema)
reports = [evaluator.evaluate(artifact, splits) for artifact in artifacts]

# Primary sort: AUC (descending); tie-break: KS (descending)
best = select_best_model(reports, split="val")

Each report is saved as reports/<model_name>_<timestamp>.json.
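The two scoring-specific metrics, KS and PSI, can be sketched in plain NumPy. The binning scheme (expected-side quantiles) and edge handling here are illustrative choices, not necessarily the package's:

```python
import numpy as np

def ks_statistic(y_true: np.ndarray, scores: np.ndarray) -> float:
    """Max gap between the score CDFs of positives and negatives."""
    thresholds = np.sort(np.unique(scores))
    pos, neg = np.sort(scores[y_true == 1]), np.sort(scores[y_true == 0])
    cdf_pos = np.searchsorted(pos, thresholds, side="right") / len(pos)
    cdf_neg = np.searchsorted(neg, thresholds, side="right") / len(neg)
    return float(np.max(np.abs(cdf_pos - cdf_neg)))

def psi(expected: np.ndarray, actual: np.ndarray, bins: int = 10) -> float:
    """Population Stability Index between two score distributions."""
    edges = np.quantile(expected, np.linspace(0, 1, bins + 1))
    # Clip both samples into the expected range so nothing falls off the ends
    e = np.histogram(np.clip(expected, edges[0], edges[-1]), edges)[0] / len(expected)
    a = np.histogram(np.clip(actual, edges[0], edges[-1]), edges)[0] / len(actual)
    e, a = np.clip(e, 1e-6, None), np.clip(a, 1e-6, None)  # avoid log(0)
    return float(np.sum((a - e) * np.log(a / e)))
```

KS is 1.0 for a perfectly separating model and 0 for a useless one; PSI is 0 when the two distributions match, and values above roughly 0.2 conventionally signal instability.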


Scoring API

Built with FastAPI. The app factory pattern (create_app()) isolates state between instances, making parallel test execution safe.

Startup behaviour: At startup, the server scans the --model-dir directory and auto-loads the artifact with the newest timestamp. Override with the SCOREFLOW_MODEL_NAME env var.

Endpoints:

| Method | Path | Description |
| --- | --- | --- |
| GET | /health | Returns model name + version, or 503 if no model loaded |
| POST | /score | Scores one or many records, returns probability scores 0–1 |
| GET | /docs | Auto-generated interactive Swagger UI |

Example response from /score:

{
  "scores": [0.823, 0.041, 0.671],
  "model_name": "lightgbm",
  "model_version": "2024-11-01T10:32:11"
}

Error codes: 422 for invalid/empty payload, 503 if no model is loaded.


Exportable Pipeline

Export a trained model to a portable self-contained folder, then run it anywhere — no full ScoreFlow install needed for scoring.

from scoreflow.pipeline.exporter import export_pipeline

pipeline_dir = export_pipeline(
    artifact,
    schema,
    output_dir="exports/"
)
# exports/logistic_regression/
#   ├── model.joblib
#   └── schema.json

Run the exported pipeline on new data:

from scoreflow.pipeline.runner import run_pipeline

scored_df = run_pipeline(
    pipeline_dir="exports/logistic_regression",
    input_path="new_customers.csv",
    output_path="scores.csv"
)
# scored_df has all original columns + a "score" column (0–1 probability)

Logging

ScoreFlow uses Python's standard logging module with an optional JSON-line formatter for production.

from scoreflow.logging_config import configure_logging

configure_logging(json_format=True)   # structured JSON to stdout
configure_logging(json_format=False)  # human-readable (default)

JSON log line example:

{"timestamp": "2024-11-01T10:32:11Z", "level": "INFO", "logger": "scoreflow.api.app", "message": "Auto-selected model artifact: lightgbm"}
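A formatter producing lines of that shape could look like this (the field names mirror the example above; extras such as request IDs and timing would be attached per-record, which this sketch omits):

```python
import json
import logging
from datetime import datetime, timezone

class JsonFormatter(logging.Formatter):
    """Emit one JSON object per log record, matching the fields shown above."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "timestamp": datetime.fromtimestamp(record.created, timezone.utc)
                                 .strftime("%Y-%m-%dT%H:%M:%SZ"),
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        return json.dumps(payload)
```

Wiring it up is the usual logging dance: create a StreamHandler, call setFormatter(JsonFormatter()), and attach it to the scoreflow root logger.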

CLI Reference

# Start the REST API server
#   --model-dir   directory of model artifacts (default: models/)
#   --model-name  load a specific model (optional)
#   --host        bind host (default: 0.0.0.0)
#   --port        bind port (default: 8000)
python -m scoreflow serve \
  --model-dir models/ \
  --model-name lightgbm \
  --host 0.0.0.0 \
  --port 8000

# Score a CSV/Parquet file using an exported pipeline
#   --pipeline  exported pipeline directory
#   --input     input file (CSV or Parquet)
#   --output    output file path
python -m scoreflow run \
  --pipeline exports/lightgbm \
  --input new_data.csv \
  --output scores.csv

Installing the package (pip install -e .) also provides a scoreflow console command.


REST API Reference

GET /health

Returns the status of the service and the currently loaded model.

Response 200:

{
  "status": "ok",
  "model": "lightgbm",
  "version": "2024-11-01T10:32:11"
}

Response 503 — no model loaded.


POST /score

Score one or more records. Each record is a flat JSON object of feature_name → value pairs matching the training schema.

Request body:

{
  "records": [
    {"age": 35, "income": 52000, "debt_ratio": 0.42},
    {"age": 28, "income": 31000, "debt_ratio": 0.65}
  ]
}

Response 200:

{
  "scores": [0.134, 0.812],
  "model_name": "lightgbm",
  "model_version": "2024-11-01T10:32:11"
}

Response 422 — empty records list or invalid feature values.
Response 503 — no model loaded.


Configuration

ScoreFlow is configured via environment variables (all optional):

| Variable | Default | Description |
| --- | --- | --- |
| SCOREFLOW_DATA_DIR | ./data | Root directory for raw data files |
| SCOREFLOW_MODELS_DIR | ./models | Root directory for saved model artifacts |
| SCOREFLOW_REPORTS_DIR | ./reports | Root directory for evaluation reports |
| SCOREFLOW_MODEL_DIR | ./models | Model directory used by the API at startup |
| SCOREFLOW_MODEL_NAME | (auto) | Force-load a specific model sub-directory |

Global training defaults live in scoreflow/config.py:

| Setting | Value | Description |
| --- | --- | --- |
| random_state | 42 | Fixed seed for reproducibility |
| test_size | 0.2 | Fraction of data held out for test |
| cv_folds | 5 | Cross-validation folds |
| primary_metric | roc_auc | Metric used for model selection |
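Putting the two tables together, config.py could plausibly be a small settings object like this; the class name and the exact env-override mechanics are assumptions:

```python
import os
from dataclasses import dataclass, field
from pathlib import Path

@dataclass(frozen=True)
class Settings:
    """Sketch of the global defaults, with env-var overrides for the paths."""
    data_dir: Path = field(default_factory=lambda: Path(
        os.environ.get("SCOREFLOW_DATA_DIR", "./data")))
    models_dir: Path = field(default_factory=lambda: Path(
        os.environ.get("SCOREFLOW_MODELS_DIR", "./models")))
    reports_dir: Path = field(default_factory=lambda: Path(
        os.environ.get("SCOREFLOW_REPORTS_DIR", "./reports")))
    random_state: int = 42
    test_size: float = 0.2
    cv_folds: int = 5
    primary_metric: str = "roc_auc"
```

Reading the environment inside default_factory (rather than at import time) means each Settings() instantiation picks up the current environment, which keeps tests that tweak env vars honest.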

Testing

# Run the full test suite (50 tests)
python -m pytest tests/ -v

# Run with code coverage
python -m pytest tests/ --cov=scoreflow --cov-report=html
open htmlcov/index.html

# Lint
python -m ruff check scoreflow tests

# Type check
python -m mypy scoreflow

Test coverage by module:

| Test file | Tests | Covers |
| --- | --- | --- |
| test_config.py | 2 | Config defaults and path resolution |
| test_data.py | 10 | Loaders, schema, splitters, validation |
| test_models.py | 6 | All classifiers, trainer, save/load |
| test_evaluation.py | 6 | AUC, KS, lift, PSI, evaluator, selection |
| test_api.py | 6 | FastAPI health + score endpoints |
| test_pipeline.py | 4 | Export structure + batch runner |
| test_integration.py | 1 | Full end-to-end: data → train → eval → select → export → run |

Project Roadmap

See ROADMAP.md for the full phased build plan.

Completed phases:

  • ✅ Phase 1 — Foundation & project setup
  • ✅ Phase 2 — Data ingestion (loaders, schema, splits, validation)
  • ✅ Phase 3 — Model training & selection (LR, RF, LightGBM + trainer + registry)
  • ✅ Phase 4 — Evaluation (AUC, KS, lift, PSI, model selection, persisted reports)
  • ✅ Phase 5 — Scoring API (FastAPI, /health, /score, model auto-loading)
  • ✅ Phase 6 — Exportable pipeline (exporter + batch runner + CLI)
  • ✅ Phase 7 — Production hardening (structured JSON logging, integration tests)

Upcoming:

  • 🔲 Dockerfile for containerised deployment
  • 🔲 Prometheus metrics (request count, latency, error rate)
  • 🔲 GitHub Actions CI (lint + test on every push)

License

MIT — see LICENSE.
