MLflow integration for Inspect AI: experiment tracking, execution tracing, and Scout analysis
inspect-mlflow
MLflow integration for Inspect AI. Provides experiment tracking, execution tracing, and artifact logging for Inspect AI evaluations.
Install
pip install inspect-mlflow
Quick Start
No code changes needed. Hooks auto-register via entry points when the package is installed. Set the environment variables shown below and run evals as usual.
# Start MLflow server
mlflow server --port 5000
# Set env vars
export MLFLOW_TRACKING_URI="http://localhost:5000"
export MLFLOW_INSPECT_TRACING="true"
# Run evals. Hooks auto-activate.
inspect eval my_task.py --model openai/gpt-4o
Then open http://localhost:5000 to see runs and traces.
What it does
Tracking Hook
Activated when MLFLOW_TRACKING_URI is set. Creates hierarchical MLflow runs mirroring the eval structure.
- Parent run per eval invocation, nested child runs per task
- Task config logged as parameters (model, dataset, solver, temperature)
- Per-sample scores as step metrics
- Model token usage (input/output/total per model)
- Real-time event counting (model calls, tool calls)
- Eval artifacts: per-sample results JSON + full eval log JSON
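The run layout described above can be approximated with plain MLflow calls. This is a minimal sketch, not the package's implementation: the `tokens/<model>/...` metric names, the `log_eval` helper, and the shape of the `tasks` input are all hypothetical and chosen for illustration. Only `mlflow.start_run`, `mlflow.log_params`, `mlflow.log_metric`, and `mlflow.log_metrics` are real MLflow APIs.

```python
from typing import Dict, Mapping


def usage_metrics(model_usage: Mapping[str, Mapping[str, int]]) -> Dict[str, float]:
    """Flatten {model: {input_tokens, output_tokens}} into metric name -> value.

    The metric naming scheme is hypothetical.
    """
    metrics: Dict[str, float] = {}
    for model, usage in model_usage.items():
        inp = usage.get("input_tokens", 0)
        out = usage.get("output_tokens", 0)
        metrics[f"tokens/{model}/input"] = float(inp)
        metrics[f"tokens/{model}/output"] = float(out)
        metrics[f"tokens/{model}/total"] = float(inp + out)
    return metrics


def log_eval(tasks) -> None:
    """Log one parent run with a nested child run per task.

    `tasks` is a hypothetical list of dicts with keys
    'name', 'params', 'scores', and 'usage'.
    """
    import mlflow  # imported lazily so the helper above works without MLflow

    with mlflow.start_run(run_name="eval"):
        for task in tasks:
            with mlflow.start_run(run_name=task["name"], nested=True):
                mlflow.log_params(task["params"])
                # One metric point per sample, using the sample index as the step.
                for step, score in enumerate(task["scores"]):
                    mlflow.log_metric("score", score, step=step)
                mlflow.log_metrics(usage_metrics(task["usage"]))
```

Logging per-sample scores with `step=` is what lets the MLflow UI plot them as a series rather than a single value.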
Tracing Hook
Activated when MLFLOW_INSPECT_TRACING=true is set in addition to MLFLOW_TRACKING_URI. Maps eval execution to MLflow trace spans.
eval_run:6fvmKSZv (CHAIN)
  task:task (CHAIN)
    sample:gM9UtEAM (CHAIN)
      solvers -> generate -> model:openai/gpt-4o-mini (LLM)
      scorers -> match -> score (EVALUATOR)
    sample:628Qbuhr (CHAIN)
      ...
Each span captures relevant data:
| Span Type | Data |
|---|---|
| LLM | model name, token counts, temperature, cache, response |
| TOOL | function name, arguments, result, errors |
| EVALUATOR | score value, explanation, target |
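The span hierarchy can be modelled as a small tree to make the nesting explicit. The sketch below is a simplified, MLflow-free illustration: the `Span` class, `build_tree`, and `render` are hypothetical helpers, and the intermediate solver and scorer steps are collapsed into a single LLM span and a single EVALUATOR span per sample.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Span:
    """A hypothetical stand-in for a trace span: a name, a type, and children."""
    name: str
    span_type: str
    children: List["Span"] = field(default_factory=list)


def build_tree(run_id: str, task_name: str, sample_ids: List[str], model: str) -> Span:
    """Build eval_run -> task -> sample -> (model call, score) for one task."""
    samples = [
        Span(
            f"sample:{sid}",
            "CHAIN",
            [Span(f"model:{model}", "LLM"), Span("score", "EVALUATOR")],
        )
        for sid in sample_ids
    ]
    task = Span(f"task:{task_name}", "CHAIN", samples)
    return Span(f"eval_run:{run_id}", "CHAIN", [task])


def render(span: Span, depth: int = 0) -> List[str]:
    """Return one indented line per span, depth-first."""
    lines = [f"{'  ' * depth}{span.name} ({span.span_type})"]
    for child in span.children:
        lines.extend(render(child, depth + 1))
    return lines


if __name__ == "__main__":
    tree = build_tree("6fvmKSZv", "task", ["gM9UtEAM", "628Qbuhr"], "openai/gpt-4o-mini")
    print("\n".join(render(tree)))
```

With MLflow itself, each node would typically correspond to a nested `with mlflow.start_span(name=..., span_type=...)` block, so parent/child relationships follow from the nesting of the context managers.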
Screenshots
Traces list showing an eval run with execution time and status:
Full span tree showing the eval hierarchy (eval_run -> task -> samples -> solvers/scorers):
LLM span detail with model name, token counts, and response text:
Configuration
| Env var | Required | Default | Description |
|---|---|---|---|
| MLFLOW_TRACKING_URI | Yes | - | MLflow server URL |
| MLFLOW_EXPERIMENT_NAME | No | inspect_ai | Experiment name |
| MLFLOW_INSPECT_TRACING | No | false | Enable execution tracing |
| MLFLOW_INSPECT_LOG_ARTIFACTS | No | true | Log eval artifacts |
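As a rough illustration of how these variables and defaults combine, the sketch below parses them into a small config object. It is not the package's code: `HookConfig`, `load_config`, and the set of accepted truthy strings are assumptions (the README only shows `"true"`). The rule that tracing requires a tracking URI follows the Tracing Hook description.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


def _as_bool(value: Optional[str], default: bool) -> bool:
    """Parse a boolean env var; the accepted spellings are an assumption."""
    if value is None:
        return default
    return value.strip().lower() in {"1", "true", "yes", "on"}


@dataclass(frozen=True)
class HookConfig:
    """Hypothetical container for the four documented settings."""
    tracking_uri: Optional[str]
    experiment_name: str = "inspect_ai"
    tracing: bool = False
    log_artifacts: bool = True

    @property
    def tracking_enabled(self) -> bool:
        return bool(self.tracking_uri)

    @property
    def tracing_enabled(self) -> bool:
        # Tracing is documented as active only when tracking is also configured.
        return self.tracking_enabled and self.tracing


def load_config(env: Mapping[str, str] = os.environ) -> HookConfig:
    """Read the documented env vars, applying the defaults from the table."""
    return HookConfig(
        tracking_uri=env.get("MLFLOW_TRACKING_URI") or None,
        experiment_name=env.get("MLFLOW_EXPERIMENT_NAME", "inspect_ai"),
        tracing=_as_bool(env.get("MLFLOW_INSPECT_TRACING"), False),
        log_artifacts=_as_bool(env.get("MLFLOW_INSPECT_LOG_ARTIFACTS"), True),
    )
```

Passing a plain dict instead of `os.environ` makes it easy to check a combination of settings without touching the real environment.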
Example
from inspect_ai import Task, eval
from inspect_ai.dataset import Sample
from inspect_ai.scorer import match
from inspect_ai.solver import generate
# No special imports needed. Hooks auto-register on install.
task = Task(
    dataset=[
        Sample(input="What is 2 + 2?", target="4"),
        Sample(input="What is 3 * 5?", target="15"),
        Sample(input="What is 10 - 7?", target="3"),
    ],
    solver=generate(),
    scorer=match(),
)
logs = eval(task, model="openai/gpt-4o-mini")
# Results are now in MLflow: runs with metrics + traces with spans
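Once an eval has finished, the logged runs can also be read back programmatically. The sketch below assumes the default experiment name `inspect_ai` from the Configuration table and that `MLFLOW_TRACKING_URI` is set in the environment; `metric_columns` and `load_runs` are hypothetical helper names. `mlflow.search_runs` is a real MLflow call that returns a pandas DataFrame whose metric columns are prefixed with `metrics.`.

```python
from typing import Iterable, List


def metric_columns(columns: Iterable[str]) -> List[str]:
    """Keep only the metric columns of a search_runs() result."""
    return [c for c in columns if c.startswith("metrics.")]


def load_runs(experiment_name: str = "inspect_ai"):
    """Return run IDs and metric columns for one experiment as a DataFrame."""
    import mlflow  # imported lazily; requires a reachable tracking server

    runs = mlflow.search_runs(experiment_names=[experiment_name])
    return runs[["run_id", *metric_columns(runs.columns)]]
```

Which metric names appear depends on what the tracking hook logged for the eval, so the helper filters by prefix instead of hard-coding names.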
Development
git clone https://github.com/debu-sinha/inspect-mlflow.git
cd inspect-mlflow
uv sync --group dev
uv run pre-commit install
uv run pytest tests/ -v
See CONTRIBUTING.md for details.
Related
- Inspect AI - AI evaluation framework by UK AISI
- MLflow - ML experiment tracking and model management
- Inspect AI hooks docs - How hooks work
- Issue #3547 - Original proposal