YAML-driven AI/LLM pipeline executor

ai-pipelines

YAML-driven AI/LLM pipeline executor. Define multi-step workflows in YAML, run them with a single Python call.

What it does

  • Describe pipelines as YAML files: read files, chunk text, call LLMs, loop, transform data, evaluate outputs
  • Steps reference each other by name using JSONata expressions
  • LLM prompt templates use Jinja2
  • Built-in LLM-as-judge evaluation with 7 scoring strategies
  • Returns per-step timing and total cost

Installation

Requires Python 3.12+ and uv.

uv sync

Quick Start

import asyncio
from ai_pipelines import load_pipeline, run_pipeline

async def main():
    pipeline = load_pipeline("my_pipeline.yaml")
    result = await run_pipeline(pipeline, {"text": "Hello world"})
    print(result.output)
    print(f"Cost: ${result.total_cost_usd:.4f}")

asyncio.run(main())

Minimal pipeline YAML

input:
  type: object
  properties:
    text: { type: string }
  required: [text]

steps:
  - kind: prompt
    name: summary
    model: haiku
    arguments: '{"text": input.text}'
    template: "Summarize this: {{ args.text }}"

Pipeline YAML Reference

Every pipeline has two top-level keys: input (a JSON Schema describing the pipeline input) and steps (an ordered list). Each step requires kind and name. The results of all prior steps are available by step name in JSONata expressions.

Step types

kind        What it does                          Key fields
read_file   Read file contents as text            arguments: JSONata path to filename
find_files  Glob file discovery                   arguments: base dir, pattern: glob
transform   Evaluate a JSONata expression         arguments: any JSONata expression
chunk       Split text into overlapping chunks    arguments: text source, chunk_size (default 4000), overlap (default 200)
prompt      LLM call with Jinja2 template         model, template, arguments, output (optional JSON Schema), system_prompt
for_each    Loop over an array, run nested steps  arguments: array source, steps: list
evaluate    LLM-as-judge scoring                  strategy, arguments, model (default haiku)
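
To make the chunk_size and overlap fields of the chunk step concrete, here is a minimal sketch of overlapping chunking. It assumes chunks are measured in characters and that each window starts chunk_size - overlap characters after the previous one; the function name chunk_text is hypothetical, and the library's own chunk step may split text differently.

```python
def chunk_text(text: str, chunk_size: int = 4000, overlap: int = 200) -> list[str]:
    """Split text into character windows that share `overlap` characters.

    Illustrative only: character-based windows are an assumption,
    not the documented behavior of the chunk step.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")

    step = chunk_size - overlap  # how far each window advances
    chunks: list[str] = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reached the end of the text
    return chunks


# Small numbers make the overlap visible: each chunk repeats
# the last character of the previous one.
print(chunk_text("abcdefghij", chunk_size=4, overlap=1))
```

With the defaults from the table, a 4001-character text would yield two chunks under these assumptions: one of 4000 characters and one of 201.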

Data flow

  • input.field — pipeline input fields
  • step_name.field — prior step output
  • Inside for_each, item is the current element and item_index its position
  • prompt templates use {{ args.field }} where args is the resolved arguments dict
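
The naming rules above can be pictured as a layered lookup table. The sketch below uses collections.ChainMap as a hypothetical stand-in for that scope; it is not the library's PipelineContext, and the step names doc_count and lengths are made up for illustration.

```python
from collections import ChainMap

# Hypothetical stand-in for the name-based scope: "input" holds the pipeline input.
scope = ChainMap({"input": {"docs": ["alpha", "beta"]}})

# A top-level step stores its result under its own name.
scope["doc_count"] = {"value": len(scope["input"]["docs"])}

lengths = []
for item_index, item in enumerate(scope["input"]["docs"]):
    # Each iteration gets a child scope that sees the outer names
    # plus the loop variables item and item_index.
    child = scope.new_child({"item": item, "item_index": item_index})
    lengths.append(
        {
            "index": child["item_index"],
            "chars": len(child["item"]),
            "of": child["doc_count"]["value"],  # outer step result, looked up by name
        }
    )

# The loop's collected results become an ordinary named step result.
scope["lengths"] = lengths
print(scope["lengths"])
```

In this sketch the loop variables live only in the child scope, so item is not visible once the loop ends; whether the library scopes them the same way is an assumption.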

Structured output

Add output with a JSON Schema to a prompt step to get a typed dict back:

- kind: prompt
  name: result
  model: sonnet
  arguments: '{"doc": input.text}'
  template: "Extract key points from: {{ args.doc }}"
  output:
    type: object
    properties:
      points: { type: array, items: { type: string } }
    required: [points]

Evaluate strategies

strategy             Required argument keys              Score
summarization        source, summary                     (qa_score + conciseness) / 2
faithfulness         source, response                    supported_claims / total_claims
hallucination        context, response                   1 - (contradicted / total)
factual_accuracy     question, context, response         mean of per-fact scores
context_relevance    question, context                   full=1.0 / partial=0.5 / none=0.0
context_utilization  question, context, response         full=1.0 / partial=0.5 / none=0.0
context_conciseness  question, context, concise_context  full=1.0 / partial=0.5 / none=0.0
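
The Score column can be read as plain arithmetic on the judge's intermediate outputs. The sketch below restates those formulas in Python; the function names are hypothetical, and the handling of a zero denominator is an assumption, since the table does not say what the library does in that case.

```python
from statistics import fmean

# Mapping used by the three context_* strategies.
LEVEL_SCORES = {"full": 1.0, "partial": 0.5, "none": 0.0}


def summarization_score(qa_score: float, conciseness: float) -> float:
    return (qa_score + conciseness) / 2


def faithfulness_score(supported_claims: int, total_claims: int) -> float:
    # Assumption: no claims at all scores 0.0 rather than raising.
    return supported_claims / total_claims if total_claims else 0.0


def hallucination_score(contradicted: int, total: int) -> float:
    # Assumption: nothing to contradict scores 1.0 rather than raising.
    return 1 - contradicted / total if total else 1.0


def factual_accuracy_score(per_fact_scores: list[float]) -> float:
    # Assumption: an empty list scores 0.0.
    return fmean(per_fact_scores) if per_fact_scores else 0.0


def level_score(level: str) -> float:
    return LEVEL_SCORES[level]


print(faithfulness_score(3, 4))                  # 3 of 4 claims supported
print(hallucination_score(1, 4))                 # 1 of 4 statements contradicted
print(factual_accuracy_score([1.0, 0.5, 0.0]))   # mean of three per-fact scores
print(level_score("partial"))
```

All seven strategies produce a score between 0.0 and 1.0 under these formulas, with higher meaning better, including hallucination, where the contradicted fraction is subtracted from 1.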

Running Tests

just test
# or
uv run pytest

Project Structure

src/ai_pipelines/
  __init__.py         Public API
  models.py           Pydantic models for all step types + discriminated union
  executor.py         Async orchestrator (run_pipeline)
  loader.py           YAML parsing + JSON Schema validation
  context.py          PipelineContext: scoped dict for step results
  expressions.py      JSONata evaluator
  templates.py        Jinja2 renderer
  validator.py        Static pre-flight validation
  errors.py           Exception hierarchy
  pipeline_logger.py  Structured JSON-lines logging
  steps/              One file per step kind
e2e/                  End-to-end example scripts and pipelines
tests/                pytest suite

Public API

from ai_pipelines import (
    configure_logging,
    load_pipeline,
    validate_pipeline,
    load_and_validate_pipeline,
    run_pipeline,
    validate_input,
)

All exceptions inherit from PipelineError:

Exception           When raised
PipelineLoadError   YAML parse failure or bad structure
ValidationError     Input/output JSON Schema mismatch
ExpressionError     Invalid or failing JSONata expression
StepExecutionError  Any step fails at runtime
LLMError            LLM call fails or returns unparseable output
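
Because every exception shares one base class, a caller can catch PipelineError once and branch on the subclass only where it matters. The sketch below defines local stand-in classes with the same names so that it runs without the package installed; making each one a direct subclass of PipelineError is an assumption, and the classify helper and its messages are hypothetical.

```python
# Local stand-ins mirroring the names in the table above.
# In real code these would be imported from the package instead.
class PipelineError(Exception): ...
class PipelineLoadError(PipelineError): ...
class ValidationError(PipelineError): ...
class ExpressionError(PipelineError): ...
class StepExecutionError(PipelineError): ...
class LLMError(PipelineError): ...


def classify(exc: PipelineError) -> str:
    """Hypothetical policy: separate definition problems from runtime failures."""
    if isinstance(exc, (PipelineLoadError, ValidationError, ExpressionError)):
        return "definition or input problem"
    return "runtime failure"


def run_safely(action) -> str:
    """Run a callable and report any pipeline error through the shared base class."""
    try:
        action()
    except PipelineError as exc:  # one handler covers every subclass
        return f"{type(exc).__name__}: {classify(exc)}"
    return "ok"


def failing_llm_call():
    raise LLMError("model returned unparseable output")


def bad_expression():
    raise ExpressionError("unknown step name")


print(run_safely(failing_llm_call))
print(run_safely(bad_expression))
print(run_safely(lambda: None))
```

Exceptions that are not part of the hierarchy, such as KeyboardInterrupt or a plain ValueError, are deliberately not caught here and would propagate to the caller.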
