Pre-deployment cost intelligence for AI agent workflows

Pretia

Know what your agent will cost before you deploy.

Pre-deployment cost intelligence for AI agent workflows. Two commands, zero config, ~$2. Get distributional cost projections (p50-p99), detect cost time-bombs, and receive dollar-denominated optimization recommendations.

Install

pip install pretia

Quick Start

Zero-cost estimate (static analysis, no execution):

pretia estimate my_agent.py

Full profile (runs your workflow, ~$2, ~3 minutes):

pretia profile run my_agent.py

No config files, no JSONL datasets, no setup. Pretia reads your workflow, generates diverse synthetic inputs, executes 20 profiling runs, detects cost patterns, and opens an HTML report with projections and recommendations.

Features

Distributional Projections

Cost projections at p50, p75, p90, p95, and p99 — not just averages. For workflows with non-linear behavior (context growth, variable loop counts), Pretia uses Monte Carlo simulation (10K runs) instead of linear scaling.
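
To see why percentiles matter more than averages for non-linear workflows, here is a minimal Monte Carlo sketch. It is not Pretia's projection engine: the token counts, the growth per iteration, the price, and the uniform loop-count distribution are all assumptions chosen for illustration.

```python
import math
import random
import statistics


def simulate_run_cost(rng, base_tokens=1_000, growth_per_iter=400,
                      price_per_1k_tokens=0.003, max_loops=8):
    """Cost of one hypothetical run whose context grows on every iteration."""
    loops = rng.randint(1, max_loops)  # assumed: loop count is uniform in [1, max_loops]
    total_tokens = sum(base_tokens + i * growth_per_iter for i in range(loops))
    return total_tokens / 1_000 * price_per_1k_tokens


def project(n_runs=10_000, seed=0):
    """Simulate n_runs runs and return the mean plus p50..p99 of per-run cost."""
    rng = random.Random(seed)
    costs = sorted(simulate_run_cost(rng) for _ in range(n_runs))

    def percentile(p):
        # Nearest-rank percentile on the sorted sample.
        idx = max(0, math.ceil(p / 100 * len(costs)) - 1)
        return costs[idx]

    result = {"mean": statistics.fmean(costs)}
    for p in (50, 75, 90, 95, 99):
        result[f"p{p}"] = percentile(p)
    return result


if __name__ == "__main__":
    for name, value in project().items():
        print(f"{name}: ${value:.4f}")
```

Because cost grows faster than linearly with the loop count in this toy model, the tail percentiles sit well above the mean, which is the situation linear scaling would understate.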

8 Pattern Detectors

Automatically detects cost risks in your workflow:

Context growth — input tokens increasing with each iteration
Loop count variance — unpredictable iteration counts
High token variance — wide spread between typical and worst-case calls
Step count variance — routing variability across runs
Bimodality — two distinct cost clusters (e.g., cache hit vs. miss)
Cache utilization opportunity — missing prompt caching on supported providers
Zero-execution steps — workflow paths never triggered during profiling
Output token budget — wasteful max_tokens settings or truncation risk
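
As an illustration of what a detector like "context growth" could look for, the sketch below flags a loop whose input token count rises on every iteration and ends well above where it started. The function name, the minimum of three iterations, and the 1.5x ratio are assumptions made for this example, not Pretia's actual rule.

```python
def detect_context_growth(input_tokens, min_growth_ratio=1.5):
    """Return True if input tokens increase every iteration and end
    at least min_growth_ratio times the first iteration's count."""
    if len(input_tokens) < 3:
        return False  # too few iterations to call it a trend
    strictly_increasing = all(
        later > earlier for earlier, later in zip(input_tokens, input_tokens[1:])
    )
    return strictly_increasing and input_tokens[-1] >= min_growth_ratio * input_tokens[0]


if __name__ == "__main__":
    print(detect_context_growth([1000, 1600, 2300, 3100]))  # growing context
    print(detect_context_growth([1000, 980, 1010, 995]))    # stable context
```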

6 Optimization Recommendations

Each recommendation comes with estimated monthly savings in dollars:

Model swap — downshift steps using frontier models for classification tasks
Loop iteration cap — cap iterations where marginal returns diminish
Circuit breaker — hard exit for stuck loops consuming >15% of cost
Enable prompt caching — activate provider caching for repeated system prompts
Filter tool definitions — remove unused tools from step context
Cache re-sent context — eliminate redundant system prompts across consecutive steps
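
The arithmetic behind a dollar-denominated recommendation can be sketched for the model swap case. This is a rough illustration under stated assumptions: prices are given per one million tokens as `(input, output)` pairs, the figures used are invented, and the function is not part of Pretia.

```python
def monthly_model_swap_savings(calls_per_month, avg_input_tokens, avg_output_tokens,
                               current_price, candidate_price):
    """Estimate monthly savings from moving one step to a cheaper model.

    current_price and candidate_price are (input, output) prices in USD
    per 1M tokens. All inputs are hypothetical.
    """
    def cost_per_call(price):
        input_price, output_price = price
        return (avg_input_tokens * input_price + avg_output_tokens * output_price) / 1_000_000

    return calls_per_month * (cost_per_call(current_price) - cost_per_call(candidate_price))


if __name__ == "__main__":
    savings = monthly_model_swap_savings(
        calls_per_month=100_000,
        avg_input_tokens=1_200,
        avg_output_tokens=50,
        current_price=(5.0, 15.0),    # assumed frontier-model pricing
        candidate_price=(0.5, 1.5),   # assumed small-model pricing
    )
    print(f"Estimated savings: ${savings:,.2f}/month")
```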

Optimization Score

A 0-100 score measuring workflow cost efficiency. Three zones: red (0-40, needs optimization), amber (41-70, room to improve), green (71-100, well optimized).
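
The zone boundaries above map to a simple lookup. The helper below is only a sketch of that mapping (the function name is made up for this example); it says nothing about how the score itself is computed.

```python
def score_zone(score):
    """Map a 0-100 optimization score to its zone as described above."""
    if not 0 <= score <= 100:
        raise ValueError("score must be between 0 and 100")
    if score <= 40:
        return "red"    # needs optimization
    if score <= 70:
        return "amber"  # room to improve
    return "green"      # well optimized
```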

Five Input Modes

A friction ladder from zero-effort to maximum precision:

Level	Command	What happens	Cost
0	`pretia estimate workflow.py`	Static code analysis only. No execution.	Free
1	`--input "How do I reset my password?"`	One run + priors for variance estimation.	~$0.10
2	`--auto-generate N` (default)	LLM generates diverse inputs from system prompt.	~$2
3	`--from-langfuse --last 100`	Pull real inputs from Langfuse production traces.	Free
4	`--inputs samples.jsonl`	User-curated test dataset. Maximum precision.	Execution only

Add to Your CI in 2 Minutes

Pretia ships a GitHub Action that comments on every PR with cost analysis.

Diff-only mode (free, default) — static analysis in seconds:

# .github/workflows/pretia.yml
name: Pretia
on: [pull_request]

permissions:
  contents: read
  pull-requests: write  # required for PR comments

jobs:
  cost-check:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: pretia-ai/pretia/action@v1
        with:
          workflow_path: src/agent.py
          cost_threshold: "20"  # fail if cost increases >20%
        env:
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}

Full profile mode (opt-in, ~$2) — real profiling with recommendations:

      - uses: pretia-ai/pretia/action@v1
        with:
          workflow_path: src/agent.py
          mode: profile
          cost_threshold: "20"
        env:
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
          OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY }}  # or your provider key

The PR comment shows: optimization score, projected monthly cost, cost delta vs. baseline, and recommendations in a collapsible section.
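
The `cost_threshold` check can be pictured as a percentage comparison against the baseline. The sketch below assumes the threshold is the maximum allowed percentage increase in projected cost, as the comment in the workflow file suggests; the function name and exact comparison are assumptions, not the action's implementation.

```python
def exceeds_cost_threshold(baseline_cost, new_cost, threshold_pct=20.0):
    """Return True if new_cost is more than threshold_pct percent above baseline_cost."""
    if baseline_cost <= 0:
        raise ValueError("baseline_cost must be positive")
    increase_pct = (new_cost - baseline_cost) / baseline_cost * 100
    return increase_pct > threshold_pct


if __name__ == "__main__":
    # Hypothetical projected monthly costs in USD.
    print(exceeds_cost_threshold(baseline_cost=400.0, new_cost=520.0))  # 30% increase
    print(exceeds_cost_threshold(baseline_cost=400.0, new_cost=440.0))  # 10% increase
```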

CLI Commands

pretia estimate workflow.py             # Instant cost estimate (no execution)
pretia profile run workflow.py          # Full profiling (default: --auto-generate 20)
pretia report profile.json              # Generate HTML report from saved profile
pretia recommend profile.json           # Generate optimization recommendations
pretia analyze --from-langfuse          # Analyze Langfuse traces (no execution)
pretia baseline update profile.json     # Save baseline for CI diffing
pretia diff baseline.json new.json      # Compare profiles, show per-step deltas
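
To make "per-step deltas" concrete, the sketch below compares two mappings of step name to cost. The real profile JSON schema is not shown here, so the flat `{step: cost}` shape and the function name are assumptions for illustration only.

```python
def per_step_deltas(baseline, new):
    """Return {step: new_cost - baseline_cost}, treating missing steps as 0.0."""
    steps = sorted(set(baseline) | set(new))
    return {step: round(new.get(step, 0.0) - baseline.get(step, 0.0), 6) for step in steps}


if __name__ == "__main__":
    baseline = {"plan": 0.010, "search": 0.030}
    new = {"plan": 0.012, "search": 0.030, "summarize": 0.020}
    for step, delta in per_step_deltas(baseline, new).items():
        print(f"{step}: {delta:+.4f} USD")
```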

Supported Frameworks

Framework	Collection method	Install
LangGraph	Callback handler	`pip install pretia[langgraph]`
OpenAI Agents SDK	RunHooks lifecycle	`pip install pretia[openai]`
Qwen-Agent	LLM proxy	`pip install pretia[qwen]`
Generic	`@collector.step()` decorator	`pip install pretia`
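
For the generic path, the idea of a step decorator can be sketched without any framework. `ToyCollector` below is a stand-in written for this example; it does not import Pretia, and the real `@collector.step()` signature and the data it records may differ.

```python
import functools
import time


class ToyCollector:
    """Hypothetical stand-in that records one entry per decorated call."""

    def __init__(self):
        self.records = []

    def step(self, name=None):
        def decorator(fn):
            step_name = name or fn.__name__

            @functools.wraps(fn)
            def wrapper(*args, **kwargs):
                start = time.perf_counter()
                try:
                    return fn(*args, **kwargs)
                finally:
                    # Record the call even if the step raised.
                    self.records.append(
                        {"step": step_name, "seconds": time.perf_counter() - start}
                    )

            return wrapper

        return decorator


collector = ToyCollector()


@collector.step()
def classify(text):
    """Placeholder for a step that would normally call an LLM."""
    return "billing" if "invoice" in text.lower() else "other"
```

Calling `classify("Invoice overdue")` returns the label and appends one record to `collector.records`, which is the kind of per-step data a profiler can aggregate afterwards.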

How It Works

Data flows through a five-stage pipeline:

Collector — Framework adapters instrument your workflow and emit unified StepRecords
StepRecord — Frozen dataclass capturing one LLM call: model, tokens, cost, timing, tool usage
ProfileStore — Persists profiling sessions as JSON (one workflow x N input runs)
Projection — Distributional scaling (p50-p99) for stable workflows, Monte Carlo for non-linear cases
Recommendation — Rule-based generators produce dollar-denominated optimization suggestions
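
The first three stages can be pictured with a frozen dataclass and a JSON dump. The field names below are guesses based on the description above (model, tokens, cost, timing, tool usage); they are not Pretia's actual schema, and `dump_session` is a hypothetical helper.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class StepRecord:
    """Illustrative record of one LLM call. Field names are assumptions."""
    step: str
    model: str
    input_tokens: int
    output_tokens: int
    cost_usd: float
    latency_ms: float
    tools_used: tuple = ()


def dump_session(workflow, runs):
    """Serialize one workflow's profiling session (a list of runs, each a list of records)."""
    payload = {
        "workflow": workflow,
        "runs": [[asdict(record) for record in run] for run in runs],
    }
    return json.dumps(payload, indent=2)


if __name__ == "__main__":
    record = StepRecord("classify", "example-model", 1200, 50, 0.00675, 830.0, ("search",))
    print(dump_session("support_agent", [[record]]))
```

Freezing the dataclass means a record cannot be mutated after it is emitted, so every later stage sees the same values the collector captured.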

The projection engine is validated against 13 real-world workflow archetypes (12/13 within 10% projection error).

Positioning

Langfuse tells you what you spent. Pretia tells you what you'll spend. Use both.

Pretia sits above the LLM tooling stack. It detects when other tools are needed — it doesn't replace them. No proxy (use LiteLLM), no routing (use Martian), no tracing (use Langfuse), no evals (use Braintrust).

Development

uv pip install -e ".[dev]"
pytest tests/unit/ -v
ruff check pretia/ tests/
ruff format pretia/ tests/
pyright pretia/

See CLAUDE.md for architecture details and coding conventions.

Contributing

Issues and PRs welcome. Run `pytest tests/unit/` and `ruff check pretia/ tests/` before opening a PR.

License

BSL 1.1 (Business Source License). Free for all use except offering Pretia as a commercial hosted service. Converts to Apache 2.0 on 2030-06-13.

Release history

Version	Release date
1.1.3	Jul 1, 2026
1.1.2	Jul 1, 2026
1.1.1	Jul 1, 2026
1.1.0	Jul 1, 2026
1.0.9	Jun 29, 2026
1.0.8	Jun 29, 2026
1.0.7	Jun 29, 2026
1.0.6	Jun 29, 2026
1.0.5	Jun 29, 2026
1.0.4 (this version)	Jun 29, 2026
1.0.3	Jun 29, 2026
1.0.2	Jun 26, 2026
1.0.1	Jun 26, 2026
1.0.0	Jun 26, 2026
0.0.1	Jun 13, 2026

Download files

Source Distribution

pretia-1.0.4.tar.gz (313.9 kB)

Uploaded Jun 29, 2026 Source

Built Distribution

pretia-1.0.4-py3-none-any.whl (122.6 kB)

Uploaded Jun 29, 2026 Python 3

File details

Details for the file pretia-1.0.4.tar.gz.

File metadata

Download URL: pretia-1.0.4.tar.gz
Upload date: Jun 29, 2026
Size: 313.9 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.9.13

File hashes

Hashes for pretia-1.0.4.tar.gz
Algorithm	Hash digest
SHA256	`7da7a7537978da00a47a2e73a13077b535beb47c6bfea4b188b61501ae83edf5`
MD5	`751f14a75d6a4263c32c4e1e65a7703a`
BLAKE2b-256	`8eb708d2794adcdf2f764fbe661641d1a9d8980a4ef9f6dfb8524e9ac2c7af5d`

File details

Details for the file pretia-1.0.4-py3-none-any.whl.

File metadata

Download URL: pretia-1.0.4-py3-none-any.whl
Upload date: Jun 29, 2026
Size: 122.6 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.9.13

File hashes

Hashes for pretia-1.0.4-py3-none-any.whl
Algorithm	Hash digest
SHA256	`f3aa416a40f5f7d889b1d2b4a3cbdcb447a98ea4c9c173fe016e443dac028ce7`
MD5	`af15de61ed0000d855811edbf1985a7e`
BLAKE2b-256	`ce650aace62fcec644260354ec8943614731c3a4ccd46d261bd2a5307f48838c`
