Ondine - The LLM Dataset Engine. SDK for processing tabular datasets using LLMs with reliability, observability, and cost control

Ondine

A prompt is a column. A new DataFrame primitive for LLMs, with five dimensions of production support.

ondine.dev · Docs · PyPI

Ondine makes LLM calls a first-class DataFrame operation. Define a column with natural language. Ondine computes it at production scale.

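import pandas as pd
from ondine import PipelineBuilder

# Sample input; any DataFrame with a "review" column works.
df = pd.DataFrame({"review": ["Great product!", "Arrived broken."]})

df = (
    PipelineBuilder.create()
    .from_dataframe(df, input_columns=["review"], output_columns=["sentiment"])
    .with_prompt("Classify the tone of: {review}")
    .with_llm(provider="openai", model="gpt-5.4-mini")
    .build()
    .execute().data
)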

The LLM stops being a service you call from your pipeline. It becomes a column function inside it.

Everything else in this README is how Ondine makes that primitive production-true across five dimensions: richer inputs (KB/RAG/OCR), constrained outputs (schemas, grounding), reliable execution (checkpoints, budget caps, adaptive concurrency), full observability, and any LLM backend.

Install

pip install ondine

Python 3.10+. Works with any LLM through LiteLLM: OpenAI, Anthropic, Groq, Mistral, Cerebras, Ollama, MLX, vLLM, SGLang, 100+ others.

30-second quickstart

from ondine import PipelineBuilder

pipeline = (
    PipelineBuilder.create()
    .from_csv("reviews.csv",
              input_columns=["review"],
              output_columns=["sentiment", "topic"])
    .with_prompt("Classify sentiment and extract the key topic from: {review}")
    .with_llm(provider="openai", model="gpt-5.4-mini")
    .with_max_budget(5.00)
    .build()
)

result = pipeline.execute()
print(f"Processed {result.metrics.processed_rows} rows · ${result.costs.total_cost:.2f}")

One builder chain: input columns, prompt, model, budget cap. Multi-column outputs get a JSON parser; schema enforcement, checkpointing, and cost tracking are on by default.
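
To make "multi-column outputs get a JSON parser" concrete, here is a minimal standalone sketch of the idea. It does not call Ondine; `split_into_columns` is a hypothetical helper, and the raw response string is made up for illustration.

```python
import json

# Hypothetical helper (not Ondine's API): map one JSON reply onto named columns.
def split_into_columns(raw_response: str, output_columns: list[str]) -> dict:
    data = json.loads(raw_response)
    # Columns the model omitted come back as None instead of raising.
    return {col: data.get(col) for col in output_columns}

row = {"review": "Battery died after a week."}
raw = '{"sentiment": "negative", "topic": "battery life"}'  # made-up model reply

# One prompt, one reply, two new columns on the same row.
row.update(split_into_columns(raw, ["sentiment", "topic"]))
print(row)
```

The point is only the shape of the transformation: one reply per row fans out into the declared output columns.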

Prefer a one-liner? QuickPipeline.create(...) wraps the same builder with sensible defaults (see examples/).

The 5 dimensions

1. INPUTS: make the prompt richer

Feed the LLM more than raw column text. Pull context from documents, images, and prior runs.

  • Knowledge Base (RAG): ingest PDFs, Markdown, HTML, images via OCR. Hybrid BM25 + dense search with optional cross-encoder reranker. HyDE / multi-query / step-back query transforms.
  • OCR: three pluggable backends: multimodal Vision LLM, Tesseract (offline), DocTR.
  • Multi-column placeholders: use any number of input columns in one prompt ({col_a}, {col_b}).
  • Jinja2 templates + system prompts for richer prompt shaping.
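
Multi-column placeholders behave much like Python's own string formatting applied once per row. The sketch below assumes nothing about Ondine's internals; `render_prompt` and the sample rows are hypothetical.

```python
# Hypothetical helper (not Ondine's API): fill a prompt template from one row.
def render_prompt(template: str, row: dict) -> str:
    # Each {name} placeholder is replaced by the row's value for that column.
    return template.format(**row)

rows = [
    {"product": "kettle", "review": "Boils fast, lid feels flimsy."},
    {"product": "lamp", "review": "Too dim for reading."},
]
template = "Classify the tone of this {product} review: {review}"

# One rendered prompt per row, each drawing on two input columns.
prompts = [render_prompt(template, row) for row in rows]
for prompt in prompts:
    print(prompt)
```

A placeholder that names a column missing from the row raises `KeyError` in this sketch, which is a reasonable way to catch template typos early.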

2. OUTPUTS: constrain what comes back

Stop parsing strings. Get typed columns, validated against your schema, verified against your evidence.

  • Pydantic structured output: define a model, get typed columns back. Malformed JSON auto-retries up to 3x.
  • Multi-column parsing: one prompt → N typed columns.
  • Grounding verification (Context Store): each LLM answer checked against an evidence graph built from your dataset. Rust + SQLite + FTS5 backend. Contradictions flagged, not silently returned.
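
The retry-on-malformed-JSON behavior described above can be sketched without any dependencies. This is an illustration, not Ondine's implementation: it uses a stdlib dataclass as a stand-in for a Pydantic model, `parse_with_retry` and `fake_llm` are hypothetical, and it assumes "up to 3x" means one initial call plus at most three retries.

```python
import json
from dataclasses import dataclass

@dataclass
class ReviewAnalysis:  # stand-in for a Pydantic model
    sentiment: str
    score: int

def parse_with_retry(call_llm, prompt: str, max_retries: int = 3) -> ReviewAnalysis:
    """Call the model and re-ask whenever the reply is not valid, complete JSON."""
    last_error = None
    for _ in range(1 + max_retries):  # first attempt plus the retries
        raw = call_llm(prompt)
        try:
            data = json.loads(raw)
            return ReviewAnalysis(sentiment=str(data["sentiment"]), score=int(data["score"]))
        except (KeyError, TypeError, ValueError) as exc:  # JSONDecodeError is a ValueError
            last_error = exc
    raise ValueError(f"no valid response after {max_retries} retries") from last_error

# Simulated model: two bad replies, then a good one.
replies = iter([
    "not json",
    '{"sentiment": "positive"}',
    '{"sentiment": "positive", "score": 8}',
])

def fake_llm(prompt: str) -> str:
    return next(replies)

result = parse_with_retry(fake_llm, "Analyze: great kettle")
print(result)
```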

3. EXECUTION: run N rows reliably

Production plumbing that df.apply() doesn't give you.

  • Checkpointing to Parquet after every batch. Durable SQLite response cache for crash-atomic resume (A4, #144).
  • Hard budget caps: pre-run cost estimation, live tracking, halts the pipeline at your USD limit.
  • Multi-row batching: pack N rows per API call. 200 calls instead of 10,000 at batch_size=50.
  • Prefix caching: system prompt cached across batches. 40–50% token savings.
  • Adaptive concurrency: Netflix Gradient2 algorithm. Shrinks on 429, grows on saturation.
  • Retry-After parsing across 5 header shapes (OpenAI / Anthropic / Groq / RFC 7231 / ms-delta).
  • Distributed rate limiting via Redis (atomic Lua token bucket, cluster-aware).
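
The batching arithmetic and the hard budget cap from the list above can be illustrated with a small standalone sketch. It is not Ondine's implementation: `chunk`, `run_with_budget`, and the flat per-call cost are hypothetical simplifications (real cost depends on token counts).

```python
from decimal import Decimal
from math import ceil

def chunk(rows, batch_size):
    """Yield consecutive slices of at most batch_size rows."""
    for start in range(0, len(rows), batch_size):
        yield rows[start:start + batch_size]

def run_with_budget(rows, batch_size, cost_per_call, max_budget):
    """Process batches until the next call would exceed max_budget."""
    spent = Decimal("0")
    processed = 0
    for batch in chunk(rows, batch_size):
        if spent + cost_per_call > max_budget:
            break  # halt before overspending
        spent += cost_per_call  # a real pipeline would call the LLM here
        processed += len(batch)
    return processed, spent

rows = list(range(10_000))
calls = ceil(len(rows) / 50)  # 200 calls at batch_size=50

# Assumed flat price of $0.02 per call and a $3.00 cap.
processed, spent = run_with_budget(rows, 50, Decimal("0.02"), Decimal("3.00"))
print(calls, processed, spent)
```

With these made-up numbers the full run would cost $4.00, so the loop stops once $3.00 is spent and leaves the remaining rows unprocessed.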

4. OBSERVATION: see what happened

On by default. Integrates with the observability stack you already run.

  • ProgressBar + Logging + CostTracking observers active on every run.
  • Langfuse for LLM trace logging.
  • OpenTelemetry for distributed tracing.
  • Prometheus metrics export (request count, duration histogram, cost gauge).
  • Decimal precision for cost tracking (no floating-point surprises).
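
Why Decimal matters for cost tracking: binary floats cannot represent most decimal fractions exactly, so running totals drift. A short illustration using only the standard library (the per-call costs are made up):

```python
from decimal import Decimal

# Made-up per-call costs in USD.
float_costs = [0.1, 0.2]
decimal_costs = [Decimal("0.1"), Decimal("0.2")]

float_total = 0.0
for cost in float_costs:
    float_total += cost

decimal_total = Decimal("0")
for cost in decimal_costs:
    decimal_total += cost

print(float_total == 0.3)               # False: float rounding error
print(decimal_total == Decimal("0.3"))  # True: exact decimal arithmetic
```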

5. PROVIDERS: any LLM backend

  • 100+ providers via LiteLLM. Swap with a string.
  • Router with latency-based failover and automatic provider selection.
  • Local inference: Ollama, MLX (Apple Silicon), vLLM, SGLang.
  • Azure Managed Identity with 3 auth patterns (MI, API key, pre-fetched token).
  • Custom endpoints: any OpenAI-compatible API.
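
Latency-based failover can be pictured as "try the fastest backend first, fall through on error". The sketch below is an assumption about the general technique, not Ondine's or LiteLLM's router; `call_with_failover`, the provider names, and the latency figures are all hypothetical.

```python
def call_with_failover(providers: dict, latencies: dict, prompt: str):
    """Try providers from lowest to highest observed latency until one succeeds."""
    ordered = sorted(providers, key=lambda name: latencies[name])
    errors = {}
    for name in ordered:
        try:
            return name, providers[name](prompt)
        except Exception as exc:  # any failure moves on to the next provider
            errors[name] = exc
    raise RuntimeError(f"all providers failed: {list(errors)}")

# Stub backends standing in for real LLM clients.
def flaky(prompt: str) -> str:
    raise TimeoutError("simulated outage")

def steady(prompt: str) -> str:
    return f"echo: {prompt}"

providers = {"fast-but-down": flaky, "slower": steady}
latencies = {"fast-but-down": 0.12, "slower": 0.40}  # seconds, made up

used, reply = call_with_failover(providers, latencies, "Classify sentiment: fine")
print(used, reply)
```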

Beyond the quickstart

from ondine import PipelineBuilder
from ondine.knowledge import KnowledgeStore
from ondine.context import RustContextStore
from pydantic import BaseModel

class ReviewAnalysis(BaseModel):
    sentiment: str
    score: int
    topic: str

kb = KnowledgeStore("knowledge.db")
kb.ingest("docs/")   # PDFs, MD, HTML, images via OCR

pipeline = (
    PipelineBuilder.create()
    .from_csv("reviews.csv",
              input_columns=["review"],
              output_columns=["sentiment", "score", "topic"])
    .with_knowledge_base(kb, top_k=5, rerank=True, query_transform="hyde")
    .with_prompt("Context:\n{_kb_context}\n\nAnalyze: {review}")
    .with_llm(provider="openai", model="gpt-5.4-mini")
    .with_structured_output(ReviewAnalysis)
    .with_context_store(RustContextStore("evidence.db"))
    .with_grounding(threshold=0.3)
    .with_batch_size(50)
    .with_max_budget(25.00)
    .with_checkpoint_interval(100)
    .with_disk_cache(".cache")
    .with_router(strategy="latency")
    .with_observer("langfuse")
    .build()
)

result = pipeline.execute()

Every chained method maps to one of the five dimensions. See docs.ondine.dev for the full reference.

What "a prompt is a column" unlocks

Same primitive. The use case lives in the prompt.

| Transform | Prompt pattern |
| --- | --- |
| Classification | "Classify {text} into one of {labels}" |
| Extraction | "Extract name, date, amount from: {document}" |
| Scoring | "Score {item} against {criteria} on 1–10" |
| Comparison | "Is {a} equivalent to {b}? Return yes/no + reason." |
| Translation | "Translate {text} from {src_lang} to {tgt_lang}" |
| Summarization | "Summarize {document} in 3 bullets" |

One abstraction. Any transform.

Compared to alternatives

| Tool | Primitive | Why pick Ondine |
| --- | --- | --- |
| Instructor | f(prompt) → Pydantic (one call) | Ondine applies that pattern to N rows, with the 5 dimensions |
| Pandas-AI | df.chat("question") | Different primitive (query vs. compute) |
| LangChain batch | chain.batch([...]) | No budget cap, no grounding, no observability defaults |
| OpenAI/Anthropic Batch API | Provider-specific batch | No multi-provider, no grounding, no crash-safety, 24-hour turnaround |
| Airflow/Prefect/Dagster | Workflow orchestrators | Heavy setup, no LLM-specific features. Ondine ships integrations for them. |
| Ondine | Prompt(columns) → new_columns | A primitive, not a wrapper |

Local inference

from ondine import QuickPipeline

# Ollama
pipeline = QuickPipeline.create(
    data="reviews.csv",
    prompt="Classify sentiment: {review}",
    output_columns=["sentiment"],
    model="ollama/qwen3.5",
)

# MLX (Apple Silicon, native; no server process)
pipeline = QuickPipeline.create(
    data="reviews.csv",
    prompt="Classify sentiment: {review}",
    output_columns=["sentiment"],
    model="mlx/mlx-community/Llama-4-Scout-Instruct-4bit",
)

No API keys. No telemetry. Fully offline.

Contributing

PRs welcome. See CONTRIBUTING.md. Code style: Black + Ruff. Tests required for new features.

License

MIT. See LICENSE.

Acknowledgments

  • LiteLLM: provider routing layer
  • Instructor: the single-call pattern Ondine applies at DataFrame scale
  • The Pydantic team: validation backbone
