Developer-first model inventory and governance framework for SR 11-7, EU AI Act, and NIST AI RMF compliance
model-ledger
The model inventory your regulator actually wants. Auto-discovered, dependency-traced, audit-ready.
model-ledger automatically discovers models, rules, pipelines, and queues across your systems — then builds the dependency graph between them.
from model_ledger import Ledger, DataNode
ledger = Ledger.from_sqlite("./inventory.db")
ledger.add([
    DataNode("segmentation", platform="etl", outputs=["customer_segments"]),
    DataNode("fraud_scorer", platform="ml", inputs=["customer_segments"], outputs=["risk_scores"]),
    DataNode("fraud_alerts", platform="alerting", inputs=["risk_scores"]),
])
ledger.connect()
ledger.trace("fraud_alerts")
# ['segmentation', 'fraud_scorer', 'fraud_alerts']
graph LR
A["segmentation<br/><small>ETL pipeline</small>"] -->|customer_segments| B["fraud_scorer<br/><small>ML model</small>"]
B -->|risk_scores| C["fraud_alerts<br/><small>Alert queue</small>"]
style A fill:#607D8B,color:#fff,stroke:#455A64
style B fill:#4CAF50,color:#fff,stroke:#388E3C
style C fill:#FF9800,color:#fff,stroke:#F57C00
Unlike model registries that track ML models only, model-ledger tracks the entire model risk ecosystem — ETL pipelines, heuristic rules, scoring jobs, alert queues, and ML models — as one connected graph with a full audit trail.
Install
pip install model-ledger # Core + SQLite backend
pip install "model-ledger[snowflake]" # + Snowflake backend (quotes keep shells such as zsh from globbing the brackets)
pip install "model-ledger[rest]" # + REST API connector
pip install "model-ledger[github]" # + GitHub connector
pip install "model-ledger[all]" # Everything
How It Works
graph TB
subgraph discover ["1. Discover"]
direction LR
DB["SQL databases"] --> F["sql_connector()"]
API["REST APIs"] --> G["rest_connector()"]
GH["GitHub repos"] --> H["github_connector()"]
CUSTOM["Your platform"] --> I["SourceConnector protocol"]
end
subgraph ledger ["2. Build Graph"]
direction LR
ADD["ledger.add()"] --> CON["ledger.connect()"]
CON --> |"match output ports<br/>to input ports"| GRAPH["Dependency graph"]
end
subgraph query ["3. Query"]
direction LR
TRACE["trace()"] ~~~ UP["upstream()"] ~~~ DOWN["downstream()"] ~~~ INV["inventory_at()"]
end
discover --> ledger --> query
style discover fill:#E3F2FD,stroke:#1565C0,color:#0D47A1
style ledger fill:#E8F5E9,stroke:#2E7D32,color:#1B5E20
style query fill:#FFF3E0,stroke:#E65100,color:#BF360C
Every model is a DataNode with typed input and output ports. When an output port name matches an input port name, connect() creates the dependency edge automatically.
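The port-matching rule can be illustrated with a few lines of plain Python. This is only a sketch of the idea, not model-ledger's implementation: the `Node` class and the `connect` function below are hypothetical stand-ins, and ports are reduced to bare strings.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical stand-in for DataNode: a name plus input/output port names."""
    name: str
    inputs: list[str] = field(default_factory=list)
    outputs: list[str] = field(default_factory=list)


def connect(nodes: list[Node]) -> list[tuple[str, str, str]]:
    """Return (producer, consumer, port) edges wherever an output name equals an input name."""
    # Index every output port by name so each input can be resolved with one lookup.
    producers: dict[str, list[str]] = {}
    for node in nodes:
        for port in node.outputs:
            producers.setdefault(port, []).append(node.name)

    edges = []
    for node in nodes:
        for port in node.inputs:
            for producer in producers.get(port, []):
                if producer != node.name:  # ignore self-loops
                    edges.append((producer, node.name, port))
    return edges


nodes = [
    Node("segmentation", outputs=["customer_segments"]),
    Node("fraud_scorer", inputs=["customer_segments"], outputs=["risk_scores"]),
    Node("fraud_alerts", inputs=["risk_scores"]),
]
edges = connect(nodes)
```

Here `edges` holds one entry per matched port, which is all the information a dependency graph needs: who produces, who consumes, and through which port.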
Every mutation is recorded as an immutable Snapshot — an append-only event log. Nothing is deleted. This gives you a complete audit trail and point-in-time inventory reconstruction for any date.
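To see why an append-only log is enough for point-in-time reconstruction, consider the following sketch. It assumes a simplified event shape (`Event`, `EventLog`, and `state_at` are illustrative names, not the library's Snapshot schema or API) and rebuilds state by replaying events up to a cutoff date.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Any


@dataclass(frozen=True)
class Event:
    """Hypothetical immutable record; the real Snapshot schema may differ."""
    model: str
    timestamp: datetime
    payload: dict[str, Any]


class EventLog:
    def __init__(self) -> None:
        self._events: list[Event] = []

    def append(self, event: Event) -> None:
        # Events are only ever added, never updated or deleted.
        self._events.append(event)

    def state_at(self, when: datetime) -> dict[str, dict[str, Any]]:
        """Replay events up to `when`, merging payloads per model."""
        state: dict[str, dict[str, Any]] = {}
        for event in sorted(self._events, key=lambda e: e.timestamp):
            if event.timestamp <= when:
                state.setdefault(event.model, {}).update(event.payload)
        return state


log = EventLog()
log.append(Event("fraud_scorer", datetime(2025, 6, 1), {"status": "active", "owner": "risk-ml"}))
log.append(Event("fraud_scorer", datetime(2026, 1, 15), {"status": "retired"}))

year_end = log.state_at(datetime(2025, 12, 31))
latest = log.state_at(datetime(2026, 2, 1))
```

In this sketch the retirement recorded in January 2026 does not affect `year_end`, because the replay stops at the cutoff. The earlier "active" record is still in the log, so both views remain available.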
Discover Models From Your Systems
SQL databases
Most discovery is "query a table, map rows to models." The sql_connector factory handles this without writing classes:
from model_ledger import Ledger, sql_connector
ledger = Ledger.from_sqlite("./inventory.db")
# Simple: discover from a registry table
models = sql_connector(
    name="model_registry",
    connection=my_db,
    query="SELECT name, owner, status FROM ml_models WHERE active = true",
    name_column="name",
)
# Advanced: auto-parse SQL to extract table dependencies
etl_jobs = sql_connector(
    name="etl_scheduler",
    connection=my_db,
    query="SELECT job_name, raw_sql, cron FROM scheduled_jobs",
    name_column="job_name",
    sql_column="raw_sql",  # extracts FROM/JOIN as inputs, INSERT/CREATE as outputs
)
ledger.add(models.discover())
ledger.add(etl_jobs.discover())
ledger.connect() # auto-links ETL outputs to model inputs
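The kind of extraction that sql_column performs (FROM/JOIN as inputs, INSERT/CREATE as outputs) can be approximated with regular expressions. The sketch below is an assumption about the general approach, not the library's parser; `extract_tables` and the table names are made up for illustration, and real SQL with CTEs, subqueries, or quoted identifiers would need a proper parser.

```python
import re

# Simplified patterns: they only recognise unquoted, optionally schema-qualified names.
_INPUT_RE = re.compile(r"\b(?:FROM|JOIN)\s+([\w.]+)", re.IGNORECASE)
_OUTPUT_RE = re.compile(
    r"\b(?:INSERT\s+INTO|CREATE\s+(?:OR\s+REPLACE\s+)?TABLE)\s+([\w.]+)",
    re.IGNORECASE,
)


def extract_tables(sql: str) -> tuple[list[str], list[str]]:
    """Return (inputs, outputs) as sorted, lower-cased table names."""
    outputs = sorted({name.lower() for name in _OUTPUT_RE.findall(sql)})
    # A table that is written by the statement is not also reported as an input.
    inputs = sorted({name.lower() for name in _INPUT_RE.findall(sql)} - set(outputs))
    return inputs, outputs


raw_sql = """
INSERT INTO analytics.customer_segments
SELECT c.id, s.segment
FROM raw.customers c
JOIN raw.segment_rules s ON c.tier = s.tier
"""
inputs, outputs = extract_tables(raw_sql)
```

With table names extracted this way, an ETL job's outputs can line up with a model's inputs by name, which is what lets the graph link them.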
REST APIs
from model_ledger import rest_connector
# Works with MLflow, SageMaker, Vertex AI, or any JSON API
ml_models = rest_connector(
    name="mlflow",
    url="https://mlflow.internal/api/2.0/mlflow/registered-models/list",
    headers={"Authorization": "Bearer ..."},
    items_path="registered_models",
    name_field="name",
)
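The items_path and name_field parameters suggest a two-step mapping: locate the list of records in the JSON response, then read each record's name. A rough sketch of that mapping follows. The dot-separated path convention and the `extract_items` helper are assumptions made for illustration, and the sample response is invented.

```python
import json
from typing import Any


def extract_items(payload: dict[str, Any], items_path: str) -> list[dict[str, Any]]:
    """Walk a dot-separated path (an assumed convention) to the list of records."""
    current: Any = payload
    for key in items_path.split("."):
        current = current[key]
    return list(current)


# Invented response body shaped like a registry listing.
response_body = json.loads("""
{"registered_models": [
    {"name": "fraud_scorer", "latest_versions": [{"version": "7"}]},
    {"name": "churn_model", "latest_versions": [{"version": "3"}]}
]}
""")

names = [item["name"] for item in extract_items(response_body, "registered_models")]
```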
GitHub repos
from model_ledger import github_connector
# Discover pipeline-as-code: Airflow DAGs, dbt projects, scoring pipelines
pipelines = github_connector(
    name="ml_pipelines",
    repos=["myorg/ml-scoring"],
    token="ghp_...",
    project_path="projects",
    config_file="deploy.yaml",
    parser=my_yaml_parser,  # (project_name, file_content) -> DataNode
)
Custom connectors
For anything the factories don't cover, implement the SourceConnector protocol:
import boto3

from model_ledger import DataNode


class SageMakerConnector:
    name = "sagemaker"

    def discover(self) -> list[DataNode]:
        # list_endpoints() is paginated; this reads the first page only.
        endpoints = boto3.client("sagemaker").list_endpoints()
        return [
            DataNode(ep["EndpointName"], platform="sagemaker",
                     outputs=[ep["EndpointName"]],
                     metadata={"status": ep["EndpointStatus"]})
            for ep in endpoints["Endpoints"]
        ]
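Because the connector contract is a protocol, any object with the right shape qualifies and no base class is required. The sketch below shows that structural idea with `typing.Protocol`. The `Connector` protocol and `Node` class are hypothetical stand-ins inferred from the example above (a `name` attribute and a `discover()` method), not the library's actual definitions.

```python
from dataclasses import dataclass, field
from typing import Any, Protocol, runtime_checkable


@dataclass
class Node:
    """Hypothetical stand-in for DataNode."""
    name: str
    platform: str = ""
    outputs: list[str] = field(default_factory=list)
    metadata: dict[str, Any] = field(default_factory=dict)


@runtime_checkable
class Connector(Protocol):
    """Assumed shape: a `name` attribute and a `discover()` method."""
    name: str

    def discover(self) -> list[Node]: ...


class StaticConnector:
    """Toy connector that 'discovers' nodes from an in-memory list."""
    name = "static"

    def __init__(self, records: list[dict[str, str]]) -> None:
        self._records = records

    def discover(self) -> list[Node]:
        return [
            Node(r["name"], platform="static", outputs=[r["name"]],
                 metadata={"status": r["status"]})
            for r in self._records
        ]


connector = StaticConnector([{"name": "credit_rules", "status": "InService"}])
found = connector.discover()
```

`StaticConnector` never inherits from `Connector`, yet `isinstance(connector, Connector)` holds because the check is structural. A toy connector like this is also a convenient test double for code that consumes discovered nodes.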
Persistent Storage
from model_ledger import Ledger
ledger = Ledger.from_sqlite("./inventory.db") # SQLite — zero infrastructure
ledger = Ledger.from_snowflake(connection, schema="DB.MODEL_LEDGER") # Snowflake — production scale
ledger = Ledger() # In-memory — testing
ledger = Ledger(my_custom_backend) # Custom — LedgerBackend protocol
Key Capabilities
Dependency tracing
ledger.trace("fraud_alerts") # Full pipeline path
ledger.upstream("fraud_alerts") # Everything that feeds this
ledger.downstream("segmentation") # Everything that depends on this
ledger.dependencies("fraud_alerts", direction="upstream") # Detailed with relationship info
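Conceptually, upstream and downstream queries are the same graph walk run in opposite directions. The following breadth-first sketch illustrates that under simple assumptions: edges are plain (producer, consumer) pairs, `reachable` is an illustrative helper rather than a library function, and `risk_dashboard` is a made-up extra consumer.

```python
from collections import deque

# Edges point from producer to consumer, mirroring the fraud example above.
EDGES = [
    ("segmentation", "fraud_scorer"),
    ("fraud_scorer", "fraud_alerts"),
    ("fraud_scorer", "risk_dashboard"),  # hypothetical extra consumer
]


def reachable(start: str, edges: list[tuple[str, str]], reverse: bool = False) -> list[str]:
    """Breadth-first walk; reverse=True follows edges backwards (upstream)."""
    adjacency: dict[str, list[str]] = {}
    for src, dst in edges:
        if reverse:
            src, dst = dst, src
        adjacency.setdefault(src, []).append(dst)

    seen, order, queue = {start}, [], deque([start])
    while queue:
        for neighbour in adjacency.get(queue.popleft(), []):
            if neighbour not in seen:  # guards against cycles and duplicates
                seen.add(neighbour)
                order.append(neighbour)
                queue.append(neighbour)
    return order


upstream = reachable("fraud_alerts", EDGES, reverse=True)
downstream = reachable("segmentation", EDGES)
```

The downstream direction is the one that answers impact questions such as "if segmentation changes, what needs revalidation?", while the upstream direction answers lineage questions about what feeds a given alert queue.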
Shared table disambiguation
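When multiple models write to the same table, matching on the table name alone would connect every reader to every writer. DataPort attributes narrow the match so that a reader connects only to the writer it actually consumes: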
from model_ledger import DataPort, DataNode
# Two models write to the same alert table with different model_name values
DataNode("check_rules", outputs=[DataPort("alerts", model_name="checks")])
DataNode("card_rules", outputs=[DataPort("alerts", model_name="cards")])
# This reader only connects to check_rules — model_name must match
DataNode("check_queue", inputs=[DataPort("alerts", model_name="checks")])
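One plausible way to express this matching rule is "same table, and every attribute the reader specifies must agree with the writer". The sketch below encodes that rule. It is an assumption about the semantics rather than the library's logic, and `Port`, `Port.of`, and `ports_match` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Port:
    """Hypothetical stand-in for DataPort: a table name plus qualifying attributes."""
    table: str
    attrs: tuple[tuple[str, str], ...] = ()

    @classmethod
    def of(cls, table: str, **attrs: str) -> "Port":
        # Sort attributes so that equal ports compare equal regardless of keyword order.
        return cls(table, tuple(sorted(attrs.items())))


def ports_match(output: Port, input_: Port) -> bool:
    """Assumed rule: same table, and every attribute the reader specifies must agree."""
    if output.table != input_.table:
        return False
    produced = dict(output.attrs)
    return all(produced.get(key) == value for key, value in input_.attrs)


checks_out = Port.of("alerts", model_name="checks")
cards_out = Port.of("alerts", model_name="cards")
queue_in = Port.of("alerts", model_name="checks")
```

Under this rule `queue_in` matches `checks_out` but not `cards_out`, while a reader that specifies no attributes, such as `Port.of("alerts")`, would match both writers.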
Point-in-time inventory
from datetime import datetime
inventory = ledger.inventory_at(datetime(2025, 12, 31))
# Every model that was active on that date
Compliance validation
Built-in profiles for major model risk regulations:
| Profile | Regulation | Checks |
|---|---|---|
| sr_11_7 | US Federal Reserve SR 11-7 | Validator independence, governance docs, validation schedule |
| eu_ai_act | EU AI Act (2024/1689) | Risk classification, data governance, human oversight |
| nist_ai_rmf | NIST AI RMF 1.0 | GOVERN, MAP, MEASURE, MANAGE functions |
Model introspection
Extract metadata from fitted ML models:
from model_ledger import introspect
result = introspect(fitted_model)
result.algorithm # "XGBClassifier"
result.features # [FeatureInfo(name="velocity_30d", ...), ...]
result.hyperparameters # {"n_estimators": 50, "max_depth": 4}
Ships with sklearn, XGBoost, and LightGBM support. Add your own via the Introspector protocol.
Design Principles
- Everything is a DataNode — ML models, heuristic rules, ETL pipelines, alert queues. One abstraction.
- The graph builds itself — declare inputs and outputs. Dependencies follow from port matching.
- Schema-agnostic metadata — Snapshot.payload is dict[str, Any]. The framework stores whatever your connectors discover.
- Append-only audit trail — every change is an immutable Snapshot. Full history, point-in-time queries.
- Factory for the 80%, protocol for the 20% — config-driven factories for common patterns, open protocols for anything custom.
- Batteries included — persistence, discovery, graph building, and compliance with zero infrastructure.
For Organizations
model-ledger is designed as a core framework with lightweight organization-specific extensions. The OSS core handles graph building, storage, compliance, and the connector factories. Your internal package provides:
- Connector configs — point sql_connector() at your tables, rest_connector() at your APIs
- Custom connectors — for internal platforms the factories don't cover
- Authentication — your database/API credentials and auth wrappers
- Additional compliance profiles — OSFI E-23, PRA SS1/23, MAS AIRG, or internal policies
Your internal repo should be thin config and credentials, not reimplemented logic.
Contributing
See CONTRIBUTING.md. All commits require DCO sign-off.
License
Apache-2.0. See LICENSE.