LLM-first conversational ETL pipeline generator
Osiris Pipeline v0.5.1
The deterministic compiler for AI-native data pipelines. You describe outcomes in plain English; Osiris compiles them into reproducible, production-ready manifests that run with the same behavior everywhere (local or cloud).
🚀 Quick Start
# Setup
python3 -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt
# Initialize configuration
osiris init
# Start MCP server for AI integration (Claude Desktop, etc.)
osiris mcp
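To connect the MCP server to Claude Desktop, one common approach is to register it in claude_desktop_config.json. The mcpServers shape below follows Claude Desktop's standard MCP client config; the server name "osiris" and the assumption that the osiris binary is on your PATH are illustrative:

```json
{
  "mcpServers": {
    "osiris": {
      "command": "osiris",
      "args": ["mcp"]
    }
  }
}
```

After restarting Claude Desktop, the Osiris tools should appear in the client's tool list.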
🎯 What Makes Osiris Different
- Compiler, not orchestrator - Others schedule what you hand-craft. Osiris generates, validates, and compiles pipelines from plain English.
- Determinism as a contract - Fingerprinted manifests guarantee reproducibility across environments.
- Conversational → executable - Describe intent; Osiris interrogates real systems and proposes a feasible plan.
- Run anywhere, same results - Transparent adapters deliver execution parity (local and E2B today).
- Boring by design - Predictable, explainable, portable — industrial-grade AI, not magical fragility.
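The determinism contract can be pictured as a fingerprint check: compiling the same inputs twice should produce byte-identical manifests. A minimal sketch using standard tools, where the two files stand in for manifests from two separate compiles (the file names and contents here are illustrative, not Osiris output):

```shell
# Sketch of the determinism contract: identical inputs should yield
# byte-identical manifests, so their fingerprints (hashes) match.
# The two files below stand in for manifests from two separate compiles.
printf 'pipeline: demo\nsteps: []\n' > run1_manifest.yaml
printf 'pipeline: demo\nsteps: []\n' > run2_manifest.yaml
h1=$(sha256sum run1_manifest.yaml | cut -d' ' -f1)
h2=$(sha256sum run2_manifest.yaml | cut -d' ' -f1)
[ "$h1" = "$h2" ] && echo "fingerprints match: compile is reproducible"
```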
📊 Visual Overview
Pipeline Execution Dashboard
Interactive HTML dashboard showing pipeline execution metrics and performance
Run Overview with E2B Integration
Comprehensive run overview showing E2B cloud execution with <1% overhead
Step-by-Step Pipeline Execution
Detailed view of pipeline steps with row counts and execution times
Example Usage via MCP
# Start the MCP server
$ osiris mcp
# Use with Claude Desktop or any MCP-compatible client to:
# - Discover database schemas and sample data
# - Generate SQL queries and transformations
# - Validate and compile pipelines
# - Execute with deterministic, reproducible results
# Or run pipelines directly:
$ osiris run examples/inactive_customers.yaml
✨ Key Features
- AI-native pipeline generation from plain English descriptions
- Deterministic compilation with fingerprinted, reproducible manifests
- Run anywhere with identical behavior (local or E2B cloud)
- Interactive HTML reports with comprehensive observability
- AI Operation Package (AIOP) for LLM-friendly debugging and analysis
- LLM-friendly with machine-readable documentation for AI assistants
🤖 LLM-Friendly Documentation
Osiris provides machine-readable documentation for AI assistants:
- For Users: Share docs/user-guide/llms.txt with ChatGPT/Claude to generate pipelines
- For Developers: Use docs/developer-guide/llms.txt for AI-assisted development
- Pro Mode: Customize AI behavior with osiris dump-prompts --export
🚀 E2B Cloud Execution
Run pipelines in isolated E2B sandboxes with <1% overhead:
# Run in cloud sandbox
osiris run pipeline.yaml --e2b
# With custom resources
osiris run pipeline.yaml --e2b --e2b-cpu 4 --e2b-mem 8
See the User Guide for complete E2B documentation.
🤖 AI Operation Package (AIOP)
Every pipeline run automatically generates a comprehensive AI Operation Package for LLM analysis:
# View AIOP export after any run
osiris logs aiop --last
# Generate human-readable summary
osiris logs aiop --last --format md
# Configure in osiris.yaml
aiop:
  enabled: true  # Auto-export after each run
  policy: core   # ≤300KB for LLM consumption
AIOP provides four semantic layers for AI understanding:
- Evidence Layer: Timestamped events, metrics, and artifacts
- Semantic Layer: DAG structure and component relationships
- Narrative Layer: Natural language descriptions with citations
- Metadata Layer: LLM primer and configuration
See AIOP Architecture for details.
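The "core" policy's ≤300KB budget can be checked with standard tools once an export exists. A sketch, where aiop.json is a stand-in path (here created with dummy content so the snippet runs on its own; point it at your run's actual export):

```shell
# Sketch: check an AIOP export against the 'core' policy budget (<=300 KB).
# 'aiop.json' is a stand-in; the dummy content below only makes this runnable.
printf '{"evidence":{},"semantic":{},"narrative":{},"metadata":{}}\n' > aiop.json
budget=$((300 * 1024))
size=$(wc -c < aiop.json)
[ "$size" -le "$budget" ] && echo "aiop.json fits the core policy ($size bytes)"
```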
📚 Documentation
For comprehensive documentation, visit the Documentation Hub:
- Quickstart - 10-minute setup guide
- User Guide - Complete usage documentation
- Architecture - Technical deep-dive with diagrams
- Developer Guide - Module patterns and LLM contracts
- Examples - Ready-to-use pipelines
🚦 Roadmap
- v0.2.0 ✅ - Conversational agent, deterministic compiler, E2B parity
- v0.3.0 ✅ - AI Operation Package (AIOP) for LLM-friendly debugging
- v0.3.1 ✅ - Fixed validation warnings for ADR-0020 compliant configs
- v0.3.5 ✅ - GraphQL extractor, DuckDB processor, test infrastructure improvements
- v0.5.1 (Current) ✅ - Critical bug fixes batch 2: OML sample, OSIRIS_HOME, Windows shell, PYTHONPATH, Guide references
- M2 - Production workflows, approvals, orchestrator integration
- M3 - Streaming, parallelism, enterprise scale
- M4 - Iceberg tables, intelligent DWH agent
See docs/roadmap/ for details.
🛠️ Contributing
See CONTRIBUTING.md for development workflow, code quality standards, and commit guidelines.
License
Apache-2.0
Download files
Download the file for your platform.
Source Distribution
Built Distribution
File details
Details for the file osiris_pipeline-0.5.1.tar.gz.
File metadata
- Download URL: osiris_pipeline-0.5.1.tar.gz
- Size: 438.7 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.13.9
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 3c49d89ea2a8a3e4c4241579835db3d459602c1bccc04863053a7f9e4b0bcb5d |
| MD5 | 20f198b505b3cc5c46aedd6a8e5a964a |
| BLAKE2b-256 | 020bc3cfc74c95101b412a47ace750d9df958fc98882d56f3badcad454b041ba |
File details
Details for the file osiris_pipeline-0.5.1-py3-none-any.whl.
File metadata
- Download URL: osiris_pipeline-0.5.1-py3-none-any.whl
- Size: 486.3 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.13.9
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 9f4b6515150bd1bb8ac2888cc90a534eee4ae97134c07b2feb555a4419dc996e |
| MD5 | 7a75d025c86d3ee8741ca4e473247838 |
| BLAKE2b-256 | f73287ff9c686fd79fe39fa3501f8205a1e3dbdaa0db05441367d38a35355411 |