GPU Financial Firewall — budget enforcement for AI inference
Lutflow — GPU Financial Firewall
Budget enforcement for AI inference workloads. Track costs, enforce budgets, kill runaway GPU spending in real time.
pip install lutflow
5-Line Quickstart
from lutflow import Client
import openai
client = Client(tenant_id="acme", budget_usd=10.00)
wrapped = client.wrap(openai.OpenAI())
response = wrapped.chat.completions.create(model="gpt-4o", messages=[{"role": "user", "content": "Hello!"}])
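Lutflow tracks every token, calculates the cost of each call, and enforces the budget. By default it raises BudgetExceededError once the budget is exceeded; with the SELF_KILL strategy it terminates the process instead.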
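The accounting described above can be sketched in plain Python. This is a minimal illustration, not Lutflow's implementation: the `TokenBudget` class, the `BudgetExceeded` exception, and the per-million-token prices are all assumptions made up for the example.

```python
from dataclasses import dataclass


class BudgetExceeded(Exception):
    """Hypothetical error raised when spend goes past the budget."""


@dataclass
class TokenBudget:
    budget_usd: float
    # Hypothetical prices in USD per 1M tokens; real rates vary by model.
    input_usd_per_mtok: float
    output_usd_per_mtok: float
    spent_usd: float = 0.0

    def record(self, input_tokens: int, output_tokens: int) -> float:
        """Add the cost of one call; raise if the budget is now exceeded."""
        cost = (
            input_tokens * self.input_usd_per_mtok
            + output_tokens * self.output_usd_per_mtok
        ) / 1_000_000
        self.spent_usd += cost
        if self.spent_usd > self.budget_usd:
            raise BudgetExceeded(
                f"spent ${self.spent_usd:.4f} of ${self.budget_usd:.2f}"
            )
        return cost

    @property
    def remaining_usd(self) -> float:
        """Budget left, never reported below zero."""
        return max(self.budget_usd - self.spent_usd, 0.0)


budget = TokenBudget(budget_usd=0.01, input_usd_per_mtok=2.0, output_usd_per_mtok=8.0)
first_cost = budget.record(input_tokens=1_000, output_tokens=500)
print(f"Spent: ${budget.spent_usd:.4f}, remaining: ${budget.remaining_usd:.4f}")
```

A second call of the same size would push the total past the budget and raise `BudgetExceeded`, which is the behavior the default RAISE_ERROR strategy is described as having.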
CLI
# GPU price lookup (powered by TinyFish)
lutflow lookup --gpu nvidia-l4
# Deploy with budget enforcement
lutflow deploy --model TinyLlama/TinyLlama-1.1B-Chat-v1.0 --budget 0.10
# Live dashboard
lutflow watch
# Check deployment status
lutflow status <deployment_id>
# Manual kill
lutflow kill <deployment_id> --reason overspend
Installation
pip install lutflow                 # Core + CLI
pip install "lutflow[openai]"       # With OpenAI wrapper
pip install "lutflow[anthropic]"    # With Anthropic wrapper
pip install "lutflow[gemini]"       # With Google Gemini wrapper
pip install "lutflow[all]"          # All providers + Kafka transport
SDK Usage
Token-Based Pricing (API Providers)
import openai
from lutflow import Client, BudgetStrategy
client = Client(
    tenant_id="my-tenant",
    budget_usd=10.00,
    on_budget_exceeded=BudgetStrategy.RAISE_ERROR,
)
# Wrap the provider client so every call is metered against the budget
wrapped = client.wrap(openai.OpenAI())
response = wrapped.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello!"}],
)
# Inspect accumulated spend and what is left of the budget
print(f"Spent: ${client.accumulated_cost_usd:.4f}")
print(f"Remaining: ${client.remaining_budget_usd:.4f}")
Cloud Mode (with Lutflow API)
from lutflow import Client, BudgetStrategy
client = Client(
    tenant_id="my-tenant",
    api_key="lut_xxx",  # Enables cloud reporting
    budget_usd=10.00,
    on_budget_exceeded=BudgetStrategy.RAISE_ERROR,
)
GPU-Time Pricing (Self-Hosted Models)
from lutflow import Client, PricingMode
client = Client(
    tenant_id="my-tenant",
    budget_usd=5.00,
    pricing_mode=PricingMode.GPU_TIME,
    gpu_type="nvidia-l4",
)
client.start_gpu_timer()
# ... run inference ...
cost = client.stop_gpu_timer()
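The arithmetic behind GPU-time pricing is elapsed time multiplied by an hourly rate. The sketch below shows that calculation on its own; it is not Lutflow's timer. The `GpuTimer` class, the injectable `clock` parameter, and the 0.80 USD/hour rate are hypothetical and exist only for illustration.

```python
import time


class GpuTimer:
    """Hypothetical timer: converts elapsed wall-clock time to USD."""

    def __init__(self, usd_per_hour: float, clock=time.monotonic):
        self.usd_per_hour = usd_per_hour
        self._clock = clock          # injectable so the example is deterministic
        self._started_at = None

    def start(self) -> None:
        self._started_at = self._clock()

    def stop(self) -> float:
        """Return the cost in USD of the interval since start()."""
        if self._started_at is None:
            raise RuntimeError("stop() called before start()")
        elapsed_s = self._clock() - self._started_at
        self._started_at = None
        return elapsed_s / 3600.0 * self.usd_per_hour


# Fake clock that reports 0 s at start and 90 s at stop.
ticks = iter([0.0, 90.0])
timer = GpuTimer(usd_per_hour=0.80, clock=lambda: next(ticks))
timer.start()
cost = timer.stop()
print(f"90 s at $0.80/hr costs ${cost:.4f}")
```

Using a monotonic clock by default avoids negative intervals if the system clock is adjusted while a job is running.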
GPU Price Lookup (TinyFish)
from lutflow.tinyfish import TinyFishClient
tf = TinyFishClient() # Uses TINYFISH_API_KEY env var
result = tf.fetch_all_prices(gpu_type="nvidia-l4")
best = result.cheapest
print(f"Cheapest: {best.provider} at ${best.spot_usd_per_hour:.2f}/hr")
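Picking the cheapest offer comes down to taking a minimum over the spot price. The sketch below shows that selection with made-up data. The `Quote` class, the provider names, and the prices are hypothetical; only the field names `provider` and `spot_usd_per_hour` are taken from the snippet above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Quote:
    """Hypothetical price quote for one provider."""
    provider: str
    spot_usd_per_hour: float


def cheapest(quotes):
    """Return the quote with the lowest spot price."""
    if not quotes:
        raise ValueError("no quotes to compare")
    return min(quotes, key=lambda q: q.spot_usd_per_hour)


# Made-up quotes for illustration; these are not real prices.
quotes = [
    Quote("provider-a", 0.42),
    Quote("provider-b", 0.31),
    Quote("provider-c", 0.55),
]
best = cheapest(quotes)
print(f"Cheapest: {best.provider} at ${best.spot_usd_per_hour:.2f}/hr")
```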
Budget Strategies
| Strategy | Behavior |
|---|---|
| RAISE_ERROR | Raises BudgetExceededError (default) |
| WARN_ONLY | Logs warning, continues execution |
| CALLBACK | Calls a custom function you provide |
| SELF_KILL | Sends SIGKILL to the process |
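One way to picture how the four strategies differ is a small dispatcher that runs once spend passes the budget. The code below is a sketch under stated assumptions, not Lutflow's code: the `Strategy` enum, `OverBudget` exception, and `enforce` function are hypothetical names, and the SIGKILL branch works on POSIX systems only.

```python
import logging
import os
import signal
from enum import Enum
from typing import Callable, Optional


class Strategy(Enum):
    """Hypothetical mirror of the four strategies in the table."""
    RAISE_ERROR = "raise_error"
    WARN_ONLY = "warn_only"
    CALLBACK = "callback"
    SELF_KILL = "self_kill"


class OverBudget(Exception):
    """Hypothetical error for the RAISE_ERROR branch."""


def enforce(
    strategy: Strategy,
    spent_usd: float,
    budget_usd: float,
    callback: Optional[Callable[[float, float], None]] = None,
) -> None:
    """Apply the chosen strategy if spend is over budget; otherwise do nothing."""
    if spent_usd <= budget_usd:
        return
    if strategy is Strategy.RAISE_ERROR:
        raise OverBudget(f"spent ${spent_usd:.4f} of ${budget_usd:.2f}")
    if strategy is Strategy.WARN_ONLY:
        logging.warning("over budget: $%.4f of $%.2f", spent_usd, budget_usd)
    elif strategy is Strategy.CALLBACK:
        if callback is None:
            raise ValueError("CALLBACK strategy needs a callback")
        callback(spent_usd, budget_usd)
    elif strategy is Strategy.SELF_KILL:
        # POSIX only: SIGKILL cannot be caught, so no cleanup code runs.
        os.kill(os.getpid(), signal.SIGKILL)


alerts = []
enforce(Strategy.CALLBACK, 12.0, 10.0, callback=lambda s, b: alerts.append((s, b)))
print(alerts)
```

Because SIGKILL skips `finally` blocks and exit handlers, a self-kill style strategy trades graceful shutdown for a hard guarantee that spending stops.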
Supported Providers
- OpenAI — GPT-4o, GPT-4, GPT-3.5, embeddings
- Anthropic — Claude 3.5, Claude 3 Opus/Sonnet/Haiku
- Google Gemini — Gemini 2.0 Flash, Gemini 1.5 Pro/Flash
- Self-hosted — vLLM, TGI, BentoML (GPU-time pricing)
Powered by
| Component | Role |
|---|---|
| TinyFish | Live GPU price discovery (Lookup layer) |
| Confluent Kafka | Telemetry stream (lut.telemetry, lut.pcpo-features) |
| Apache Flink | Real-time cost aggregation (TUMBLE 10s windows) |
| Confluent Tableflow | Delta Lake sync for PCPO training |
| Databricks | PCPO model training (XGBoost v2, 21 features) |
| Neo4j AuraDB | DSPM knowledge graph |
| Google Cloud | GKE cluster, eBPF enforcement |
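The table mentions cost aggregation over 10-second tumbling windows. As a rough illustration of what a tumbling window does, the pure-Python sketch below groups cost events into fixed, non-overlapping 10-second buckets. It is not Flink code, and the event shape `(timestamp_s, tenant_id, cost_usd)` is an assumption made for the example.

```python
from collections import defaultdict

WINDOW_S = 10  # window length in seconds, matching the 10s windows above


def tumbling_cost(events, window_s=WINDOW_S):
    """Sum cost per (window_start, tenant) over fixed, non-overlapping windows.

    events: iterable of (timestamp_s, tenant_id, cost_usd) tuples (assumed shape).
    """
    totals = defaultdict(float)
    for ts, tenant, cost in events:
        # Every timestamp belongs to exactly one window: [start, start + window_s)
        window_start = int(ts // window_s) * window_s
        totals[(window_start, tenant)] += cost
    return dict(totals)


# Made-up events for illustration.
events = [
    (0.5, "acme", 0.25),
    (9.9, "acme", 0.25),
    (10.0, "acme", 0.5),   # exactly on the boundary, so it opens the next window
    (12.0, "beta", 1.0),
]
totals = tumbling_cost(events)
for key in sorted(totals):
    print(key, totals[key])
```

A streaming engine additionally has to deal with late and out-of-order events, which this batch-style sketch ignores.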
Lutflow is a Confluent AI Accelerator Cohort 3 participant and TinyFish Accelerator participant — the first Latin American team in both programs.
Environment Variables
| Variable | Description |
|---|---|
| LUTFLOW_ENDPOINT | API endpoint (default: https://api.lutflow.dev) |
| TINYFISH_API_KEY | TinyFish API key for live GPU pricing |
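For scripts that need the same settings, the two variables can be read with the standard library. This is a minimal sketch, assuming the default endpoint listed above; the `load_settings` helper and its return shape are hypothetical and not part of Lutflow.

```python
import os

DEFAULT_ENDPOINT = "https://api.lutflow.dev"  # default taken from the table above


def load_settings(env=None):
    """Read the two documented variables, falling back to the default endpoint."""
    env = os.environ if env is None else env
    return {
        "endpoint": env.get("LUTFLOW_ENDPOINT", DEFAULT_ENDPOINT),
        "tinyfish_api_key": env.get("TINYFISH_API_KEY"),  # None when unset
    }


settings = load_settings({"TINYFISH_API_KEY": "example-key"})
print(settings["endpoint"])
```

Passing a plain dict instead of `os.environ` keeps the helper easy to test without touching the real environment.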
Links
- Website: pypi.org/project/lutflow
- Docs: pypi.org/project/lutflow
- GitHub: github.com/Lutflow/lutflow
License
Business Source License 1.1 (BSL 1.1)
- Free for: internal use, development, testing, evaluation, non-commercial use
- Commercial license required for: offering GPU cost management as a service
- Change Date: March 29, 2030 (converts to Apache 2.0)
See LICENSE for full terms. For commercial licensing: licensing@lutflow.com
File details
Details for the file lutflow-0.1.0a3.tar.gz.
File metadata
- Download URL: lutflow-0.1.0a3.tar.gz
- Upload date:
- Size: 47.6 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.14.3
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 462bd8411a91371baec36215bf01c6091d4a37d0e766929ae805028b173ee242 |
| MD5 | 3ca3ffb7f0ef0ae731b30b1cf7812aaf |
| BLAKE2b-256 | 457a0aceae8e2edfb0f16e13b429815ebc844b72b2cfd6d525374c052607754b |
File details
Details for the file lutflow-0.1.0a3-py3-none-any.whl.
File metadata
- Download URL: lutflow-0.1.0a3-py3-none-any.whl
- Upload date:
- Size: 37.9 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.14.3
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | b63e61c73d1f77e4d1875450d19a195debeb9d7527e06473f618c728a1a34277 |
| MD5 | 46d9f476e0eda8dc976c6221d826f663 |
| BLAKE2b-256 | bf0f94fa3a41257e36400bee34ab4088ebfada20809d87a517e56f166ab98ff0 |