Vincio: the context engineering platform for AI applications. Compiles prompts, memory, retrieval, tools, schemas, and policies into optimized, validated, observable model-ready context packets.

Vincio — the context engineering platform for AI applications

The scarce resource is not the model. It is the context you feed it.

PyPI version · CI · Python 3.11+ · Apache 2.0 · 195 tests passing · Ruff · Pydantic v2 · Offline-first


Vincio is a Python platform for building context-engineered AI applications. It compiles prompts, memory, retrieval, tools, schemas, and policies into optimized, testable, observable, provider-neutral context packets — then validates and evaluates every output.

Most LLM frameworks help you call a model. Vincio governs the boundary between your application state and the model: what evidence is selected, how it is scored and budgeted, how it is rendered for cache reuse, and how the result is validated, measured, and traced. Named for Leonardo da Vinci — engineering and craft in equal measure.

Raw Input → Normalization → Objective Detection → Memory Selection
→ Retrieval Planning → Evidence Retrieval → Ranking + Distillation
→ Tool Planning → Context Compilation → Model Execution
→ Parsing + Validation → Evaluation + Guardrails → Trace + Learning Loop

Contents

Why Vincio · Install · 60-second quickstart · Features · Benchmarks · Comparison · Use cases · Examples · CLI · Architecture · Roadmap · Documentation

Why Vincio

Teams ship a prompt, watch it work, then spend months fighting everything around it: context that overflows the window, retrieved chunks that contradict each other, outputs that fail to parse, silent quality regressions, untraceable costs, and prompt-injection risk. These are not model problems — they are context problems.

Vincio treats context as a compiled artifact with a clear contract:

  • Deterministic where it matters. Security, permissions, and validation are enforced in code — never gated on model output. The same input compiles to the same packet.
  • Measured, not asserted. Every run is traced and costed; every change can be gated by an eval suite before it ships.
  • Provider-neutral. OpenAI, Anthropic, Google, Mistral, any OpenAI-compatible endpoint, or a deterministic offline mock — behind one interface.
  • One coherent model from input to output, instead of a bag of loosely-coupled utilities.
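
One way to make "the same input compiles to the same packet" checkable is to fingerprint a packet by hashing a canonical serialization. The sketch below is illustrative only: `packet_fingerprint` and the packet layout are assumptions made for this example, not part of Vincio's API.

```python
import hashlib
import json


def packet_fingerprint(packet: dict) -> str:
    """Hash a canonical JSON form so that equal packets get equal fingerprints."""
    # sort_keys and fixed separators remove dict-ordering and whitespace variance.
    canonical = json.dumps(packet, sort_keys=True, separators=(",", ":"), ensure_ascii=True)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Hypothetical packet layout, for illustration only.
a = {"system": "Answer only from sources.", "evidence": ["doc-1", "doc-2"]}
b = {"evidence": ["doc-1", "doc-2"], "system": "Answer only from sources."}

# Same content in a different key order yields the same fingerprint.
assert packet_fingerprint(a) == packet_fingerprint(b)
```

Comparing fingerprints across two runs of the same input is a cheap regression check for accidental nondeterminism, such as unordered sets or timestamps leaking into the context.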

Install

pip install vincio                  # core — runs fully offline with the mock provider
pip install "vincio[openai]"        # + OpenAI provider
pip install "vincio[anthropic]"     # + Anthropic provider
pip install "vincio[all]"           # every optional integration

Requires Python 3.11+. Core dependencies are just pydantic, httpx, pyyaml, and typing-extensions; every heavy integration (vector stores, OCR, server, OpenTelemetry, …) is an opt-in extra.

60-second quickstart

from vincio import ContextApp

app = ContextApp(name="docs_qa")
app.add_source("docs", path="./docs", retrieval="hybrid")
app.set_policy("answer_only_from_sources", True)

result = app.run("How do I configure SSO?")
print(result.output)      # the grounded answer
print(result.citations)   # evidence the answer actually cited
print(result.trace_id)    # every run produces a full trace
print(result.cost_usd)    # …and a cost

No API key? It runs offline out of the box on a deterministic mock provider that emits schema-valid output — so your whole pipeline (retrieval, validation, evals, traces) runs for real in CI.

Typed output

from pydantic import BaseModel
from vincio import ContextApp

class TicketClassification(BaseModel):
    label: str
    confidence: float
    reason: str

app = ContextApp(name="triage", output_schema=TicketClassification)
result = app.run("The dashboard crashes after login")

result.output              # → a validated TicketClassification instance
result.output.label        # → the label field of that instance (str)

Agents with tools and memory

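from pydantic import BaseModel
from vincio import ContextApp

class RefundDecision(BaseModel):   # placeholder schema for illustration; define your own fields
    approved: bool
    reason: str

app = ContextApp(name="support_refunds", output_schema=RefundDecision)
app.add_memory(scope="user", strategy="semantic")   # per-user semantic memory
app.add_tool("billing_lookup", permissions=["billing:read"])
app.add_tool("refund_create", permissions=["billing:write"], approval_required=True)   # write action gated on approval

agent = app.agent(max_steps=6)   # hard cap on agent steps
result = agent.run("Customer asks for a refund on invoice INV-123.")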
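
The step budget and approval gate in the snippet above can be pictured as plain control flow. The following is a minimal standalone sketch using only the standard library; `Tool`, `run_bounded`, and the `approve` callback are hypothetical names for this illustration and do not describe how Vincio implements agents.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Tool:
    name: str
    fn: Callable[[str], str]
    approval_required: bool = False


def run_bounded(plan: list[tuple[str, str]], tools: dict[str, Tool],
                approve: Callable[[str, str], bool], max_steps: int = 6) -> list[str]:
    """Execute planned (tool_name, argument) steps under a hard step budget."""
    log: list[str] = []
    for step, (tool_name, arg) in enumerate(plan):
        if step >= max_steps:
            log.append("stopped: step budget exhausted")
            break
        tool = tools[tool_name]
        # The gate is enforced in code, before the tool runs.
        if tool.approval_required and not approve(tool_name, arg):
            log.append(f"{tool_name}: denied")
            continue
        log.append(f"{tool_name}: {tool.fn(arg)}")
    return log


tools = {
    "billing_lookup": Tool("billing_lookup", lambda inv: f"{inv} paid"),
    "refund_create": Tool("refund_create", lambda inv: f"refunded {inv}", approval_required=True),
}
plan = [("billing_lookup", "INV-123"), ("refund_create", "INV-123")]

# An approver that rejects everything: the read runs, the write is blocked.
log = run_bounded(plan, tools, approve=lambda name, arg: False)
print(log)  # ['billing_lookup: INV-123 paid', 'refund_create: denied']
```

Because both the budget and the gate are checked outside the model, a plan that loops or requests an unapproved write cannot get past them, whatever the model emits.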

Evaluation as a gate

from vincio.evals import Dataset, EvalRunner

dataset = Dataset.load("golden/support_triage.jsonl")
report = EvalRunner(app).run(dataset)
report.print_summary()     # groundedness, citation accuracy, schema validity, cost — with CI exit codes
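
To show what "evaluation as a gate" means mechanically, here is a small standalone sketch: score a golden JSONL set, then turn the score into a process exit code. The names `load_jsonl`, `gate`, and `keyword_classifier`, the JSONL fields, and the thresholds are all assumptions for this example; Vincio's `EvalRunner` may work differently.

```python
import json


def load_jsonl(text: str) -> list[dict]:
    """Parse one JSON object per non-empty line."""
    return [json.loads(line) for line in text.splitlines() if line.strip()]


def gate(cases, predict, min_accuracy, baseline=None):
    """Return (exit_code, accuracy); a non-zero code should fail the CI job."""
    correct = sum(predict(case["input"]) == case["expected"] for case in cases)
    accuracy = correct / len(cases)
    regressed = baseline is not None and accuracy < baseline
    exit_code = 1 if accuracy < min_accuracy or regressed else 0
    return exit_code, accuracy


# Hypothetical golden set with assumed field names.
GOLDEN = """\
{"input": "The dashboard crashes after login", "expected": "bug"}
{"input": "How do I export my data?", "expected": "question"}
"""


def keyword_classifier(text: str) -> str:
    # Stand-in for a real app run.
    return "bug" if "crash" in text.lower() else "question"


cases = load_jsonl(GOLDEN)
exit_code, accuracy = gate(cases, keyword_classifier, min_accuracy=0.9)
print(exit_code, accuracy)  # in CI you would call sys.exit(exit_code)
```

Passing a `baseline` makes the gate fail on any regression against a previous run, even when the absolute threshold is still met.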

Features

Vincio is organized into composable subsystems. Use the high-level ContextApp runtime, or reach for any engine directly.

| Subsystem | What it does |
| --- | --- |
| Prompt compiler | Typed prompt ASTs with ${variables}, lint rules, cache-aware stable-prefix layout, versioning, hashing, diffing, variant generation. |
| Context compiler | Scores every candidate (relevance, novelty, authority, freshness, provenance, token cost, leakage risk), deduplicates, resolves conflicts, compresses, and packs to a token budget, with an excluded-context report explaining every omission. |
| Retrieval (RAG) | Hybrid BM25 + dense retrieval, query planning with subqueries, rerankers, entity-graph and multi-hop retrieval, reasoning retrieval that reports missing fact types, citations. |
| Memory | Layered (session → episodic → semantic → tenant → graph) with a guarded write pipeline, confidence decay, contradiction resolution, and privacy scoping. |
| Tools | Permissioned registry (RBAC scopes + ABAC rules), schema derivation from type hints, sandboxing, reliability scoring, idempotent write-action guardrails with approval callbacks. |
| Agents | Bounded DAG execution with planners (direct / static / dynamic / ReAct / plan-and-execute), critics, validators, human gates, and hard budget enforcement. |
| Workflows | Deterministic DAGs with retries, branching, parallelism, compensation, and approval gates. |
| Structured output | Pydantic output contracts, robust parsers (fenced / embedded / lenient / streaming JSON), a validation pipeline, and principled repair that fixes structure only and never invents facts. |
| Evaluation | Golden JSONL datasets, 17+ task / grounding / retrieval / operational metrics, deterministic and model judges, regression gates, and baseline-diff reports. |
| Optimization | Prompt / context / routing / cache search driven by an eval-fitness function, with safety-gated promotion that blocks any candidate regressing schema validity or safety. |
| Observability | Every run yields a full trace span tree; JSONL and OpenTelemetry exporters; per-run cost tracking. |
| Security | Deterministic PII / secret detection and redaction, prompt-injection defense, RBAC / ABAC, tenant isolation, and a hash-chained audit log. |
| Storage | Pluggable metadata (in-memory / SQLite / Postgres), blob, analytics (DuckDB), vector (Qdrant / pgvector), and graph (Neo4j) backends behind one factory. |
| Providers | OpenAI, Anthropic, Google, Mistral, any OpenAI-compatible endpoint, and a deterministic offline mock, all async-first with sync wrappers. |

Every extension point — providers, metrics, chunkers, rerankers, judges, validators, tools — accepts your own implementation via a registry.
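
As a rough illustration of the context compiler's "score, deduplicate, pack to a budget, explain omissions" behavior, the sketch below does a greedy pack over pre-scored candidates. It is a simplified stand-in, not Vincio's algorithm: `Candidate`, `pack`, the single blended `score`, and the exact-match deduplication are assumptions for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    id: str
    text: str
    score: float   # assumed to be a precomputed blend of relevance, novelty, etc.
    tokens: int


def pack(candidates: list[Candidate], budget: int):
    """Greedily pack candidates by score; return (included, excluded-with-reason)."""
    included: list[Candidate] = []
    excluded: dict[str, str] = {}
    seen: set[str] = set()
    used = 0
    # Highest score first; ties broken by id so the result is deterministic.
    for c in sorted(candidates, key=lambda c: (-c.score, c.id)):
        key = " ".join(c.text.lower().split())   # case- and whitespace-insensitive
        if key in seen:
            excluded[c.id] = "duplicate"
        elif used + c.tokens > budget:
            excluded[c.id] = "over budget"
        else:
            included.append(c)
            seen.add(key)
            used += c.tokens
    return included, excluded


candidates = [
    Candidate("a", "SSO uses SAML.", 0.9, 40),
    Candidate("b", "sso uses  saml.", 0.8, 40),
    Candidate("c", "Full changelog for every release", 0.5, 80),
    Candidate("d", "SCIM provisioning is optional.", 0.4, 30),
]
included, excluded = pack(candidates, budget=80)
print([c.id for c in included])  # ['a', 'd']
print(excluded)                  # {'b': 'duplicate', 'c': 'over budget'}
```

The `excluded` mapping plays the role of an excluded-context report: every omitted candidate carries a reason that can be inspected when an answer is missing evidence.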

Benchmarks

VincioBench ships in benchmarks/ and runs fully offline (deterministic provider + deterministic metrics) so results are reproducible. Each family compares the Vincio pipeline against a naive baseline. Representative results on the bundled reference corpus:

| Family | Metric | Vincio | Naive baseline |
| --- | --- | --- | --- |
| Context compression | evidence tokens for the same task | 216 | 1,175 (stuff-everything) |
| | → token reduction | −81.6% | |
| Output recovery | malformed model outputs successfully parsed | 5 / 5 | 3 / 5 (json.loads) |
| Security | prompt-injection detection rate | 100% | |
| | injection false-positive rate | 0% | |
| | PII coverage | 100% | |
| Retrieval | recall@3 / MRR (known-answer corpus) | 1.00 / 1.00 | |
| Memory | preference recall · contradiction supersede · tenant isolation | pass | |
| Tools | runtime overhead, p50 | 0.02 ms | |
| Agents | adversarial infinite-loop model | bounded (budget) | unbounded |

Honest by design. These numbers come from a small, synthetic offline corpus and are meant to demonstrate the mechanisms, not to be quoted as universal gains. The context-compression hypothesis (a 20–40% reduction target) is measured per run, and VincioBench reports whether it was met on your data. Run python benchmarks/vinciobench.py against your own corpus — and trust only what that prints. See benchmarks/README.md.
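
The "Output recovery" row contrasts a lenient parser with plain `json.loads`. The sketch below shows the general idea for fenced and embedded JSON only; `parse_lenient` is a hypothetical helper written for this example and is not Vincio's parser. It recovers structure and never fills in values.

```python
import json
import re

FENCE = "`" * 3   # built programmatically so this example can itself sit in a fenced block
_FENCED = re.compile(FENCE + r"(?:json)?\s*(.*?)" + FENCE, re.DOTALL)


def parse_lenient(raw: str):
    """Try strict JSON, then a fenced block, then the outermost embedded object."""
    attempts = [raw]
    fenced = _FENCED.search(raw)
    if fenced:
        attempts.append(fenced.group(1))
    start, end = raw.find("{"), raw.rfind("}")
    if 0 <= start < end:
        attempts.append(raw[start:end + 1])
    for text in attempts:
        try:
            return json.loads(text)
        except json.JSONDecodeError:
            continue
    raise ValueError("no JSON object found")


samples = [
    '{"label": "bug"}',                                                   # already valid
    "Sure! Here is the result:\n" + FENCE + 'json\n{"label": "bug"}\n' + FENCE,  # fenced
    'The answer is {"label": "bug"} as requested.',                       # embedded in prose
]
results = [parse_lenient(s) for s in samples]
print(results)
```

Plain `json.loads` would reject the second and third samples outright, which is the kind of gap the benchmark row measures.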

How Vincio compares

Each ecosystem below is broad and capable in its own focus area. The table reflects built-in, in-library capabilities — not what is reachable by bolting on a separate product or SaaS.

Capability Vincio LangChain LlamaIndex DSPy Ragas
Scored, budgeted context compiler
Typed prompt AST + lint + cache layout
Hybrid (BM25 + dense) RAG
Layered memory (decay, conflicts, scopes)
Permissioned tool registry (RBAC/ABAC)
Bounded agents + deterministic workflows
Structured output + structure-only repair
Built-in evals + CI gates
Eval-driven optimization (gated promotion)
Native tracing + cost, no account needed
Deterministic security (PII / injection / audit)

✅ first-class in-library · ➖ partial or via a separate add-on/SaaS · ❌ not a focus. Reflects mid-2026; ecosystems evolve. Vincio is built to interoperate — wrap a LangChain component as a tool, feed LlamaIndex-parsed documents into a source, use a DSPy program as a provider, or register a Ragas metric with @register_metric. See the in-depth write-ups in docs/comparisons/.
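
Decorator-based registration such as `@register_metric` usually follows the registry pattern. The sketch below is a generic, standalone version of that pattern: the `register_metric` defined here is a stand-in written for illustration, and its name-taking signature is an assumption, not Vincio's actual decorator.

```python
from typing import Callable

Metric = Callable[[str, str], float]
_METRICS: dict[str, Metric] = {}


def register_metric(name: str) -> Callable[[Metric], Metric]:
    """Return a decorator that stores the metric under `name`."""
    def decorator(fn: Metric) -> Metric:
        if name in _METRICS:
            raise ValueError(f"metric {name!r} is already registered")
        _METRICS[name] = fn
        return fn   # the function stays usable directly
    return decorator


@register_metric("exact_match")
def exact_match(output: str, expected: str) -> float:
    return 1.0 if output.strip() == expected.strip() else 0.0


def score(name: str, output: str, expected: str) -> float:
    """Look a metric up by name and apply it."""
    return _METRICS[name](output, expected)


print(score("exact_match", " bug ", "bug"))  # 1.0
```

Looking metrics up by name is what lets an eval configuration refer to third-party metrics without importing them at the call site.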

Use cases

| You want to… | Reach for | Example |
| --- | --- | --- |
| Classify and route support tickets into typed labels | typed output | 01_support_triage.py |
| Answer questions over your docs with real citations | hybrid RAG + grounding policy | 02_document_qa.py |
| Review contracts clause-by-clause | end-to-end context app | 03_contract_review.py |
| Extract structured fields from invoices | structured extraction + F1 eval | 04_invoice_extraction.py |
| Build a research agent with bounded budgets | ReAct agent + tools | 05_research_agent.py |
| Automate a CRM agent with approval-gated writes | memory + permissioned tools | 06_crm_agent.py |
| Ask questions over a codebase | code-aware chunking + import graph | 07_codebase_qa.py |
| Analyze spreadsheets with schema awareness | table chunking + quality checks | 08_spreadsheet_analysis.py |
| Gate quality in CI | datasets, gates, baseline diff | 09_eval_pipeline.py |
| Tune prompts/context against an eval suite | optimization + gated promotion | 10_optimization_run.py |

More examples

All ten examples in examples/ run fully offline with no API keys. Point them at a real model with environment variables:

export VINCIO_PROVIDER=openai VINCIO_MODEL=gpt-5.2-mini OPENAI_API_KEY=sk-...
cd examples && python 02_document_qa.py

Command line

vincio init my-project           # scaffold config, a starter app, and a golden dataset
vincio run app.py --input "..."  # run an app
vincio eval run golden.jsonl     # run an eval suite (with CI gates and baseline compare)
vincio prompt lint prompts/      # lint prompt specs
vincio trace show trace_123      # inspect a run's full trace
vincio optimize run --target groundedness
vincio index build ./docs        # build a retrieval index
vincio memory inspect --user u1  # inspect a user's memory

A FastAPI server (API-key + JWT auth, SSE streaming) is available via from vincio.server import create_app — see docs/reference/api.md.

Architecture

                         ┌──────────────────────────────────────────────┐
   user input  ─────────▶│  Input engine   normalize · classify · scope  │
                         └───────────────┬──────────────────────────────┘
                                         ▼
        ┌──────────────┐        ┌────────────────┐        ┌──────────────┐
        │   Memory     │───────▶│    CONTEXT     │◀───────│  Retrieval   │
        │  L0…L5       │        │   COMPILER     │        │  hybrid RAG  │
        └──────────────┘        │ score·dedupe·  │        └──────────────┘
        ┌──────────────┐        │ conflict·      │        ┌──────────────┐
        │    Tools     │───────▶│ compress·budget│◀───────│   Prompt     │
        │ permissioned │        └───────┬────────┘        │  compiler    │
        └──────────────┘                ▼                 └──────────────┘
                              ┌────────────────────┐
                              │   Model execution  │   provider-neutral
                              └─────────┬──────────┘
                                        ▼
                    ┌─────────────────────────────────────────┐
                    │ Output validation · Evals · Security ·   │
                    │ Trace + cost · Memory write-back         │
                    └─────────────────────────────────────────┘

See AGENTS.md for the package layout and docs/concepts/ for a tour of each engine.

Roadmap

Vincio 0.1.0 ships every in-scope subsystem above, with 195 offline tests, ten runnable examples, and full documentation. The public roadmap — what's shipped, what's next, and what's intentionally out of scope — lives in ROADMAP.md.

Vincio is, and stays, a library. The building blocks for production operation (audit chain, retention, tenant isolation, RBAC/ABAC, a server) ship in the package for you to deploy on your own infrastructure. Hosted services and managed control planes are not part of this project.
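
For readers unfamiliar with the audit chain mentioned above, the idea behind a hash-chained log is that each entry commits to the hash of the previous one, so editing any earlier entry breaks verification. This is a minimal standalone sketch under assumed field names (`event`, `prev`, `hash`); it does not describe Vincio's on-disk format.

```python
import hashlib
import json

GENESIS = "0" * 64   # assumed sentinel for "no previous entry"


def _digest(event: dict, prev_hash: str) -> str:
    body = json.dumps({"event": event, "prev": prev_hash}, sort_keys=True)
    return hashlib.sha256(body.encode("utf-8")).hexdigest()


def append_entry(log: list[dict], event: dict) -> None:
    """Append an event whose hash covers both the event and the previous hash."""
    prev_hash = log[-1]["hash"] if log else GENESIS
    log.append({"event": event, "prev": prev_hash, "hash": _digest(event, prev_hash)})


def verify(log: list[dict]) -> bool:
    """Recompute every link; any edited or reordered entry fails the check."""
    prev_hash = GENESIS
    for entry in log:
        if entry["prev"] != prev_hash or entry["hash"] != _digest(entry["event"], prev_hash):
            return False
        prev_hash = entry["hash"]
    return True


log: list[dict] = []
append_entry(log, {"actor": "u1", "action": "billing_lookup"})
append_entry(log, {"actor": "u1", "action": "refund_create"})
assert verify(log)

log[0]["event"]["actor"] = "mallory"   # tamper with history
assert not verify(log)
```

Verification needs only the log itself, which suits self-hosted deployments where no external service vouches for the records.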

Documentation

Contributing

Contributions are welcome. The test suite runs fully offline in a couple of seconds and must stay green:

pip install -e ".[dev]"
python -m pytest tests/ -q     # 195 tests, no network or API keys required
ruff check vincio/ tests/

See AGENTS.md for the codebase layout and engineering conventions.

License

Apache License 2.0 © Vincio Contributors.

