GPU Financial Firewall — budget enforcement for AI inference

Lutflow — GPU Financial Firewall

Budget enforcement for AI inference workloads. Track costs, enforce budgets, kill runaway GPU spending in real time.

pip install lutflow

5-Line Quickstart

from lutflow import Client
import openai

client = Client(tenant_id="acme", budget_usd=10.00)
wrapped = client.wrap(openai.OpenAI())
response = wrapped.chat.completions.create(model="gpt-4o", messages=[{"role": "user", "content": "Hello!"}])

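Lutflow tracks every token, calculates the cost of each call, and stops further spending once the budget is exceeded. By default it raises BudgetExceededError; the Budget Strategies section lists the alternatives, including killing the process.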
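
To make the accounting concrete, here is a minimal sketch of token-based budget tracking. It is not Lutflow's internal implementation: the `BudgetTracker` class, the `BudgetExceeded` error, and the per-million-token prices are all hypothetical names and placeholder values chosen for illustration.

```python
from dataclasses import dataclass


class BudgetExceeded(Exception):
    """Hypothetical error, standing in for whatever the library raises."""


@dataclass
class BudgetTracker:
    budget_usd: float
    # Placeholder prices in USD per one million tokens, not real rates.
    input_usd_per_mtok: float = 2.00
    output_usd_per_mtok: float = 8.00
    spent_usd: float = 0.0

    def record(self, input_tokens: int, output_tokens: int) -> float:
        """Add the cost of one call and fail once the budget is exceeded."""
        cost = (
            input_tokens * self.input_usd_per_mtok
            + output_tokens * self.output_usd_per_mtok
        ) / 1_000_000
        self.spent_usd += cost
        if self.spent_usd > self.budget_usd:
            raise BudgetExceeded(
                f"spent ${self.spent_usd:.4f} of ${self.budget_usd:.2f}"
            )
        return cost

    @property
    def remaining_usd(self) -> float:
        """Budget left, clamped at zero."""
        return max(self.budget_usd - self.spent_usd, 0.0)
```

Note that in this sketch the check runs after a call has been billed, so the final call can push spending slightly past the budget before the error is raised.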

CLI

# GPU price lookup (powered by TinyFish)
lutflow lookup --gpu nvidia-l4

# Deploy with budget enforcement
lutflow deploy --model TinyLlama/TinyLlama-1.1B-Chat-v1.0 --budget 0.10

# Live dashboard
lutflow watch

# Check deployment status
lutflow status <deployment_id>

# Manual kill
lutflow kill <deployment_id> --reason overspend

Installation

pip install lutflow              # Core + CLI
pip install "lutflow[openai]"    # With OpenAI wrapper
pip install "lutflow[anthropic]" # With Anthropic wrapper
pip install "lutflow[gemini]"    # With Google Gemini wrapper
pip install "lutflow[all]"       # All providers + Kafka transport

The quotes keep shells such as zsh from treating the square brackets as a glob pattern.

SDK Usage

Token-Based Pricing (API Providers)

import openai
from lutflow import Client, BudgetStrategy

client = Client(
    tenant_id="my-tenant",
    budget_usd=10.00,
    on_budget_exceeded=BudgetStrategy.RAISE_ERROR,
)

wrapped = client.wrap(openai.OpenAI())
response = wrapped.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello!"}],
)

print(f"Spent: ${client.accumulated_cost_usd:.4f}")
print(f"Remaining: ${client.remaining_budget_usd:.4f}")

Cloud Mode (with Lutflow API)

from lutflow import Client, BudgetStrategy

client = Client(
    tenant_id="my-tenant",
    api_key="lut_xxx",  # Enables cloud reporting
    budget_usd=10.00,
    on_budget_exceeded=BudgetStrategy.RAISE_ERROR,
)

GPU-Time Pricing (Self-Hosted Models)

from lutflow import Client, PricingMode

client = Client(
    tenant_id="my-tenant",
    budget_usd=5.00,
    pricing_mode=PricingMode.GPU_TIME,
    gpu_type="nvidia-l4",
)

client.start_gpu_timer()
# ... run inference ...
cost = client.stop_gpu_timer()
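
The arithmetic behind GPU-time pricing is elapsed wall-clock time multiplied by an hourly rate. The sketch below illustrates that calculation only; `GpuTimer` and its injectable `clock` parameter are hypothetical and do not describe how Lutflow's own timer is built or where it gets its rate.

```python
import time


class GpuTimer:
    """Hypothetical timer that bills wall-clock time at an hourly rate."""

    def __init__(self, usd_per_hour: float, clock=time.monotonic):
        self.usd_per_hour = usd_per_hour
        self._clock = clock  # injectable so the arithmetic can be tested
        self._started_at = None

    def start(self) -> None:
        """Remember when billing began."""
        self._started_at = self._clock()

    def stop(self) -> float:
        """Return the cost in USD of the time elapsed since start()."""
        if self._started_at is None:
            raise RuntimeError("start() must be called before stop()")
        elapsed_s = self._clock() - self._started_at
        self._started_at = None
        return elapsed_s / 3600 * self.usd_per_hour
```

A monotonic clock is used here so that system clock adjustments during a long inference run cannot produce a negative or inflated duration.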

GPU Price Lookup (TinyFish)

from lutflow.tinyfish import TinyFishClient

tf = TinyFishClient()  # Uses TINYFISH_API_KEY env var
result = tf.fetch_all_prices(gpu_type="nvidia-l4")

best = result.cheapest
print(f"Cheapest: {best.provider} at ${best.spot_usd_per_hour:.2f}/hr")
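
Selecting the cheapest offer amounts to taking the minimum over a list of quotes. A self-contained sketch of that step follows; the `Quote` type, the provider names, and the prices are invented placeholders, not TinyFish output or real market rates.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Quote:
    provider: str
    spot_usd_per_hour: float


def cheapest(quotes: list[Quote]) -> Quote:
    """Return the quote with the lowest spot price."""
    if not quotes:
        raise ValueError("no quotes to compare")
    return min(quotes, key=lambda q: q.spot_usd_per_hour)


# Placeholder providers and prices, invented for this example.
quotes = [
    Quote("provider-a", 0.71),
    Quote("provider-b", 0.48),
    Quote("provider-c", 0.55),
]
best = cheapest(quotes)
print(f"Cheapest: {best.provider} at ${best.spot_usd_per_hour:.2f}/hr")
```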

Budget Strategies

Strategy       Behavior
RAISE_ERROR    Raises BudgetExceededError (default)
WARN_ONLY      Logs warning, continues execution
CALLBACK       Calls a custom function you provide
SELF_KILL      Sends SIGKILL to the process
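
The four behaviors can be pictured as a single dispatch on the configured strategy. The following is an illustrative sketch, not Lutflow's code: the `Strategy` enum, `BudgetExceeded`, and `on_exceeded` are hypothetical stand-ins, and the callback signature is an assumption.

```python
import logging
import os
import signal
from enum import Enum
from typing import Callable, Optional


class Strategy(Enum):
    RAISE_ERROR = "raise_error"
    WARN_ONLY = "warn_only"
    CALLBACK = "callback"
    SELF_KILL = "self_kill"


class BudgetExceeded(Exception):
    """Hypothetical error type used only in this sketch."""


def on_exceeded(
    strategy: Strategy,
    spent_usd: float,
    budget_usd: float,
    callback: Optional[Callable[[float, float], None]] = None,
) -> None:
    """React to an exceeded budget according to the chosen strategy."""
    message = f"budget exceeded: ${spent_usd:.2f} of ${budget_usd:.2f}"
    if strategy is Strategy.RAISE_ERROR:
        raise BudgetExceeded(message)
    if strategy is Strategy.WARN_ONLY:
        logging.warning(message)  # execution continues
    elif strategy is Strategy.CALLBACK:
        if callback is None:
            raise ValueError("CALLBACK strategy needs a callback")
        callback(spent_usd, budget_usd)
    elif strategy is Strategy.SELF_KILL:
        # POSIX only. SIGKILL cannot be caught, so no cleanup code runs.
        os.kill(os.getpid(), signal.SIGKILL)
```

The trade-off is worth noting: raising an error lets the caller clean up and decide what to do next, while SIGKILL guarantees the process stops but skips `finally` blocks and exit handlers.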

Supported Providers

  • OpenAI — GPT-4o, GPT-4, GPT-3.5, embeddings
  • Anthropic — Claude 3.5, Claude 3 Opus/Sonnet/Haiku
  • Google Gemini — Gemini 2.0 Flash, Gemini 1.5 Pro/Flash
  • Self-hosted — vLLM, TGI, BentoML (GPU-time pricing)

Powered by

Component            Role
TinyFish             Live GPU price discovery (Lookup layer)
Confluent Kafka      Telemetry stream (lut.telemetry, lut.pcpo-features)
Apache Flink         Real-time cost aggregation (TUMBLE 10s windows)
Confluent Tableflow  Delta Lake sync for PCPO training
Databricks           PCPO model training (XGBoost v2, 21 features)
Neo4j AuraDB         DSPM knowledge graph
Google Cloud         GKE cluster, eBPF enforcement
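
For readers unfamiliar with tumbling windows, the idea behind the 10-second cost aggregation can be sketched in plain Python. This is a conceptual illustration of fixed, non-overlapping windows, not the project's Flink job; the event tuple layout `(timestamp_s, tenant_id, cost_usd)` is an assumption.

```python
from collections import defaultdict

WINDOW_S = 10  # window length in seconds


def tumbling_cost(events, window_s=WINDOW_S):
    """Sum cost per (tenant, window start) over non-overlapping windows.

    `events` is an iterable of (timestamp_s, tenant_id, cost_usd) tuples.
    """
    totals = defaultdict(float)
    for timestamp_s, tenant_id, cost_usd in events:
        # Each event belongs to exactly one window: [start, start + window_s).
        window_start = int(timestamp_s // window_s) * window_s
        totals[(tenant_id, window_start)] += cost_usd
    return dict(totals)
```

Because the windows do not overlap, every telemetry event is counted once, which keeps the per-window totals additive.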

Lutflow is a Confluent AI Accelerator Cohort 3 participant and TinyFish Accelerator participant — the first Latin American team in both programs.

Environment Variables

Variable          Description
LUTFLOW_ENDPOINT  API endpoint (default: https://api.lutflow.dev)
TINYFISH_API_KEY  TinyFish API key for live GPU pricing
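
If you want to inspect the effective configuration in your own scripts, the variables can be read with the standard library as sketched below. The `load_settings` helper is hypothetical and not part of Lutflow; only the variable names and the default endpoint come from the table above.

```python
import os

DEFAULT_ENDPOINT = "https://api.lutflow.dev"


def load_settings(env=os.environ):
    """Read the documented variables, falling back to the documented default."""
    return {
        "endpoint": env.get("LUTFLOW_ENDPOINT", DEFAULT_ENDPOINT),
        # None when unset, so callers can tell the key is missing.
        "tinyfish_api_key": env.get("TINYFISH_API_KEY"),
    }
```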

License

Business Source License 1.1 (BSL 1.1)

  • Free for: internal use, development, testing, evaluation, non-commercial use
  • Commercial license required for: offering GPU cost management as a service
  • Change Date: March 29, 2030 (converts to Apache 2.0)

See LICENSE for full terms. For commercial licensing: licensing@lutflow.com
