# Agent Skills Runtime

*Agents should execute whenever possible.*

A deterministic, binding-driven execution engine for composable AI agent skills.

Agent Skills Runtime lets you define agent capabilities as abstract contracts, wire them to any backend (Python, OpenAPI, MCP, OpenRPC), and execute multi-step workflows as declarative DAGs, with built-in safety gates, cognitive state tracking, and full observability.

No API keys required: 122 capabilities ship with deterministic Python baselines. Install and run your first skill in under 3 minutes.
## Mental Model

Think of Agent Skills as:

- APIs → turned into reusable "capabilities"
- Workflows (Zapier / Airflow) → turned into "skills"
- Agent reasoning → made explicit via structured state

In short: Agent Skills lets agents execute structured workflows over tools, instead of guessing what to do via prompts.
## Table of Contents

- Mental Model
- Introducing ORCA
- Why Agent Skills?
- When should you use Agent Skills?
- Architecture
- How it compares
- Quick Start
- License
- Citing
- Advanced Features
- Documentation
- Contributing
- Troubleshooting
## Introducing ORCA

Agent Skills Runtime is a reference implementation of ORCA, an emerging standard for structured agent execution.

ORCA (Open Cognitive Runtime Architecture) defines a Cognitive Execution Layer where agents do not act through prompts, but through composable, contract-driven processes.

Unlike traditional agent frameworks that rely on implicit reasoning inside LLMs, ORCA externalizes cognition into:

- Structured state (CognitiveState): explicit, inspectable reasoning
- Capabilities (contracts): reusable, binding-agnostic operations
- Skills (execution graphs): deterministic, composable workflows
- Built-in safety: enforced execution constraints and validation

This shifts agent systems from prompt-driven behavior to execution-driven systems.
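Externalized state can be pictured as a small typed-state sketch. The field names below are illustrative assumptions, not the runtime's actual schema (see COGNITIVE_STATE_V1.md for the real one):

```python
from dataclasses import dataclass, field

@dataclass
class Frame:
    """What the agent is trying to do: a goal plus fixed constraints."""
    goal: str
    constraints: list[str] = field(default_factory=list)

@dataclass
class CognitiveState:
    """Illustrative sketch of an ORCA-style explicit state record."""
    frame: Frame
    working: dict = field(default_factory=dict)     # intermediate step results
    output: dict = field(default_factory=dict)      # final, validated results
    trace: list[str] = field(default_factory=list)  # ordered execution log

# Every step reads and writes this record instead of hiding state in a prompt.
state = CognitiveState(frame=Frame(goal="summarize report"))
state.working["step1"] = {"summary": "..."}
state.trace.append("step1: text.summarize executed")
```

Because the state is plain data, it can be inspected, checkpointed, and audited at any point in a run.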
### Core Principles of ORCA

- Execution over prompting
- Explicit state over implicit context
- Contracts over conventions
- Separation of intent and execution
- Safety as a first-class concern

### Learn more

See the full ORCA specification: ORCA.md
## Why Agent Skills?

| Problem | How Agent Skills solves it |
|---|---|
| Tools are coupled to one framework | Binding abstraction: same capability, 4 protocols (PythonCall, OpenAPI, MCP, OpenRPC) |
| Workflows are imperative code | Declarative YAML skills: steps, dependencies, mappings resolved by the runtime |
| No safety model | 4-tier safety gates: trust levels, confirmation prompts, scope constraints, side-effect tracking |
| No structured reasoning state | CognitiveState v1: typed Frame/Working/Output/Trace aligned with CoALA |
| Inconsistent naming | Controlled vocabulary: 122 capabilities across 27 domains with governed naming |
| Hard to observe | OTel + metrics + audit: hash-chain audit trail, Prometheus metrics, SSE streaming |
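To make the "declarative YAML skills" row concrete, here is a hedged sketch of what a two-step skill might look like. Every field name here (`skill`, `steps`, `depends_on`, the `{{ ... }}` mapping syntax) is an illustrative assumption; the real schema lives in the registry and SKILL_AUTHORING.md:

```yaml
# Illustrative only -- not the shipped schema.
skill: report.summarize-and-translate
steps:
  - id: summarize
    capability: text.content.summarize
    inputs:
      text: "{{ inputs.document }}"
  - id: translate
    capability: text.content.translate
    depends_on: [summarize]
    inputs:
      text: "{{ steps.summarize.outputs.summary }}"
      target_language: "es"
```

The runtime, not your code, resolves the dependency order and wires each step's outputs into the next step's inputs.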
## When should you use Agent Skills?

Use it if you need:

- Deterministic and reproducible agent behavior
- Safe interaction with real systems (APIs, databases, etc.)
- Reusable workflows instead of prompt engineering
- Observability and auditability of agent execution

Avoid it if:

- You just need a quick prompt-based prototype
- You don't need control over execution or safety
## Architecture
Note: The diagram below uses Mermaid. It renders natively on GitHub. If viewing on PyPI or another platform, see the architecture diagram on GitHub.
```mermaid
graph TB
  subgraph "Developer Interface"
    CLI["CLI<br/>agent-skills run / describe / scaffold / test"]
    HTTP["HTTP API<br/>REST + SSE streaming"]
    SDK["SDKs<br/>Python · TypeScript · LangChain · CrewAI · AutoGen · SemanticKernel"]
    MCP_SERVER["MCP Server<br/>stdio + SSE transport"]
    LLM_NATIVE["Native LLM Adapters<br/>Anthropic · OpenAI · Gemini"]
  end
  subgraph "Gateway Layer"
    GW["Skill Gateway<br/>Discovery · Ranking · Governance"]
  end
  subgraph "Execution Engine"
    SCHED["DAG Scheduler<br/>Kahn's topological sort · parallel / sequential"]
    POLICY["Policy Engine<br/>Safety gates · Trust levels · Confirmation"]
    COGSTATE["CognitiveState v1<br/>Frame · Working · Output · Trace"]
  end
  subgraph "Binding Layer"
    BR["Binding Resolver<br/>Protocol routing · Fallback chain · Conformance"]
    PC["PythonCall"]
    OA["OpenAPI"]
    MCP_P["MCP"]
    RPC["OpenRPC"]
  end
  subgraph "Services"
    BASELINE["Python Baselines<br/>122 deterministic functions"]
    EXTERNAL["External APIs<br/>OpenAI · custom services"]
    MCP_SRV["MCP Servers<br/>In-process + subprocess"]
  end
  CLI --> GW
  HTTP --> GW
  SDK --> HTTP
  MCP_SERVER --> GW
  LLM_NATIVE --> GW
  GW --> SCHED
  SCHED --> POLICY
  POLICY --> BR
  SCHED -.-> COGSTATE
  BR --> PC --> BASELINE
  BR --> OA --> EXTERNAL
  BR --> MCP_P --> MCP_SRV
  BR --> RPC --> EXTERNAL
```
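The scheduler's Kahn-style topological sort can be sketched in a few lines. This toy version (not the runtime's actual code) shows the core idea: steps whose dependencies are all satisfied form a "level" that can be dispatched in parallel:

```python
from collections import deque

def kahn_levels(deps: dict[str, set[str]]) -> list[list[str]]:
    """Group DAG nodes into levels; each level's steps can run in parallel.

    deps maps step -> set of steps it depends on. Raises on cycles.
    """
    indegree = {node: len(d) for node, d in deps.items()}
    dependents: dict[str, list[str]] = {node: [] for node in deps}
    for node, d in deps.items():
        for dep in d:
            dependents[dep].append(node)

    ready = deque(sorted(n for n, k in indegree.items() if k == 0))
    levels, seen = [], 0
    while ready:
        level = sorted(ready)   # snapshot: everything runnable right now
        ready.clear()
        for node in level:
            seen += 1
            for nxt in dependents[node]:
                indegree[nxt] -= 1
                if indegree[nxt] == 0:
                    ready.append(nxt)
        levels.append(level)

    if seen != len(deps):
        raise ValueError("cycle detected in skill DAG")
    return levels
```

For a diamond-shaped skill (`b` and `c` both depend on `a`, `d` depends on both), this yields three levels, with `b` and `c` eligible for parallel execution.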
## How it compares

| Dimension | Agent Skills | LangGraph | SemanticKernel | OpenAI SDK | CrewAI |
|---|---|---|---|---|---|
| DAG Execution | ✅ Kahn sort | ✅ StateGraph | ⚠️ Linear | ⚠️ Tool-loop | ⚠️ Sequential |
| Multi-Protocol Bindings | ✅ 4 protocols | ❌ Python only | ⚠️ HTTP+plugins | ❌ Function only | ❌ Function only |
| Safety Model | ✅ 4-tier gates | ❌ None | ⚠️ Basic | ❌ Minimal | ❌ None |
| Cognitive State | ✅ Typed (CoALA) | ❌ No formal | ❌ No formal | ❌ No formal | ⚠️ Roles |
| Capability Registry | ✅ 122 governed | ❌ None | ⚠️ Plugin store | ❌ None | ⚠️ Templates |
| Observability | ✅ OTel+Metrics+Audit | ✅ LangSmith | ⚠️ AppInsights | ⚠️ Log-only | ⚠️ Basic |
| Zero-config local run | ✅ Python baselines | ⚠️ Needs LLM key | ⚠️ Needs Azure | ❌ Needs API key | ⚠️ Needs LLM key |
| Declarative workflows | ✅ YAML skills | ⚠️ Python code | ⚠️ C# code | ❌ Imperative | ⚠️ Python code |
| Checkpoint/Restore | ✅ Full state | ✅ Checkpoints | ✅ State | ❌ Stateless | ⚠️ Memory |
## Quick Start

### Install from PyPI

```bash
pip install orca-agent-skills        # core
pip install orca-agent-skills[all]   # + PDF, web, OTel extras
pip install orca-agent-skills[mcp]   # + MCP server/client
pip install orca-agent-skills[dev]   # + pytest, ruff, benchmarks
```

The PyPI package includes the execution engine and CLI. You'll also need the companion agent-skill-registry (capability contracts, skills, vocabulary):

```bash
git clone https://github.com/gfernandf/agent-skill-registry.git
agent-skills doctor   # verifies the registry is found
```
### Install from source

```bash
git clone https://github.com/gfernandf/agent-skills.git
cd agent-skills
make bootstrap        # clones registry alongside, installs deps
agent-skills doctor   # all checks should pass
```

What `make bootstrap` does: it clones the registry into `../agent-skill-registry/`, then runs `pip install -e ".[all,dev]"`. If you prefer manual setup, see docs/INSTALLATION.md.
### Run your first skill

```bash
agent-skills run text.summarize-plain-input \
  --input '{"text": "Agent Skills Runtime is a deterministic execution engine for composable AI agent skills. It supports four binding protocols and ships with 122 Python baselines.", "max_length": 50}'
```

Expected output:

```json
{
  "summary": "Agent Skills Runtime is a deterministic execution engine...",
  "sentiment": "positive"
}
```
### Run via HTTP

```bash
agent-skills serve                     # starts server on :8080
curl http://localhost:8080/v1/health   # health check
curl -X POST http://localhost:8080/v1/skills/text.summarize-plain-input/execute \
  -H "Content-Type: application/json" \
  -d '{"inputs": {"text": "Hello world", "max_length": 20}}'
```
### Baseline → LLM: same skill, two modes

Every capability ships with a deterministic Python baseline. Set OPENAI_API_KEY to upgrade to LLM-powered execution with zero code changes.

```bash
# 1. Baseline mode (no API key, pure Python)
agent-skills run text.summarize-plain-input \
  --input '{"text": "Agent Skills decouples capability contracts from execution backends.", "max_length": 30}'
# → {"summary": "Agent Skills decouples capability contracts from exec..."}

# 2. LLM mode (set key, same command)
export OPENAI_API_KEY=sk-...
agent-skills run text.summarize-plain-input \
  --input '{"text": "Agent Skills decouples capability contracts from execution backends.", "max_length": 30}'
# → {"summary": "Agent Skills separates capability definitions from their runtime implementations."}
```
The binding resolver picks the best available backend automatically:

- No key → `PythonCall` baseline (deterministic, offline, fast)
- Key set → `OpenAPI` binding to OpenAI (richer output, higher latency)

This means your CI stays green without API keys, and production gets LLM quality from the same skill YAML.
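That selection behaves like a preference-ordered fallback chain. The sketch below is an assumption about the mechanism, not the runtime's code; in particular the preference order is hypothetical, and the real resolver also weighs conformance profiles and per-binding health:

```python
import os

def pick_binding(capability: str, available: dict[str, bool]) -> str:
    """Illustrative binding choice: prefer LLM-backed protocols when a key
    is present, and always fall back to the deterministic baseline."""
    preference = ["openapi", "mcp", "openrpc", "pythoncall"]  # assumed order
    if not os.environ.get("OPENAI_API_KEY"):
        # No key: skip remote LLM bindings, go straight to the baseline.
        preference = ["pythoncall"]
    for protocol in preference:
        if available.get(protocol):
            return protocol
    raise LookupError(f"no binding available for {capability}")
```

The key property is the terminal baseline: whatever else is configured, `pythoncall` is always a valid last resort, which is what keeps key-less CI runs deterministic.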
See docs/INSTALLATION.md for full setup instructions, optional extras, and environment variable reference.
### Use with LangChain / LangGraph

```python
from sdk.embedded import as_langchain_tools

# No server needed: runs in-process
tools = as_langchain_tools(["text.content.summarize", "text.content.translate"])

# Pass tools to any LangChain AgentExecutor or LangGraph node, e.g.:
# from langgraph.prebuilt import create_react_agent
agent = create_react_agent(llm, tools)  # llm: any LangChain chat model
```

Adapters are also available for CrewAI, AutoGen, and Semantic Kernel; see the sdk/ directory.
### Use as MCP Server

Expose all 122 capabilities as MCP tools over stdio (or SSE); any MCP-compatible host (Claude Desktop, VS Code Copilot, etc.) can discover and call them:

```bash
# stdio transport (default; for Claude Desktop / MCP hosts)
python -m official_mcp_servers

# SSE transport (for network clients)
python -m official_mcp_servers --sse --host 0.0.0.0 --port 8765

# Or via the CLI
agent-skills mcp-serve
agent-skills mcp-serve --sse --port 8765
```

Requires the mcp extra: `pip install -e ".[mcp]"`
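As an example of host configuration, a Claude Desktop entry in `claude_desktop_config.json` might look like the following. The server name `agent-skills` is arbitrary, and you may need to point `command` at the Python interpreter of the environment where the package is installed:

```json
{
  "mcpServers": {
    "agent-skills": {
      "command": "python",
      "args": ["-m", "official_mcp_servers"]
    }
  }
}
```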
### Native LLM Tool Definitions

Generate provider-native tool arrays for Anthropic, OpenAI, and Gemini: no HTTP server, no adapters, just the format each SDK expects:

```python
from sdk.embedded import (
    as_anthropic_tools, execute_anthropic_tool_call,
    as_openai_tools, execute_openai_tool_call,
    as_gemini_tools, execute_gemini_tool_call,
)

# ── Anthropic ──────────────────────────────────
tools = as_anthropic_tools(["text.content.summarize"])
response = client.messages.create(model="claude-sonnet-4-20250514", tools=tools, ...)
result = execute_anthropic_tool_call(block.name, block.input)

# ── OpenAI ─────────────────────────────────────
tools = as_openai_tools()  # all 122 capabilities
response = openai.chat.completions.create(model="gpt-4o", tools=tools, ...)
result = execute_openai_tool_call(call.function.name, call.function.arguments)

# ── Gemini ─────────────────────────────────────
tools = as_gemini_tools(["data.schema.validate"])
response = model.generate_content(contents, tools=tools)
result = execute_gemini_tool_call(fc.name, fc.args)
```

Each `execute_*` helper maps the underscore tool name back to the dotted capability ID and returns a JSON string.
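For intuition, that name mapping could be sketched as below. This is an illustrative guess at the mechanism, not the shipped code; the real helpers presumably resolve names against the registry rather than string-replacing blindly (a dot-for-underscore swap is only safe if capability ID segments never contain underscores):

```python
def tool_name_to_capability_id(tool_name: str) -> str:
    """'text_content_summarize' -> 'text.content.summarize' (assumes no
    underscores inside segments)."""
    return tool_name.replace("_", ".")

def capability_id_to_tool_name(capability_id: str) -> str:
    """Provider tool-name rules disallow dots, so '.' becomes '_'."""
    return capability_id.replace(".", "_")
```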
### Choosing your integration mode

| Mode | Best for | Latency | Requires server? |
|---|---|---|---|
| Embedded SDK (`sdk.embedded`) | Python apps, notebooks, scripts | Lowest (in-process) | No |
| Native LLM tools (`as_anthropic_tools`, etc.) | Direct Anthropic/OpenAI/Gemini integration | Low (in-process) | No |
| LangChain / CrewAI / AutoGen (`as_langchain_tools`, etc.) | Framework-based agents | Low (in-process) | No |
| MCP Server (`python -m official_mcp_servers`) | Claude Desktop, VS Code Copilot, MCP hosts | Low (stdio/SSE) | MCP host |
| HTTP REST (`agent-skills serve`) | Microservices, non-Python clients, multi-tenant | Medium (network) | Yes |

Start here: if you're writing Python, use the embedded SDK for zero setup and the lowest latency. Switch to HTTP only when you need network access or non-Python clients.
## License

Apache 2.0; see LICENSE.

## Citing

If you use Agent Skills in your research, please cite:

```bibtex
@software{fernandez_agent_skills_2026,
  author  = {Fernandez Alvarez, Guillermo},
  title   = {Agent Skills Runtime},
  year    = {2026},
  url     = {https://github.com/gfernandf/agent-skills},
  version = {0.1.0},
  license = {Apache-2.0}
}
```

GitHub also provides a "Cite this repository" button powered by CITATION.cff.
## Advanced Features

| Feature | Description | Docs |
|---|---|---|
| NL Autopilot | `agent-skills ask "summarize this"`: discovers, maps, executes | SKILL_AUTHORING.md |
| Dev Watch | Hot-reload skill development with `agent-skills dev` | SKILL_AUTHORING.md |
| Skill Triggers | Declarative webhook / event / file-change triggers | WEBHOOKS.md |
| Benchmark Lab | Compare binding protocols side-by-side | CLI: `agent-skills benchmark-lab` |
| Compose DSL | Compact `.compose` text syntax for workflows | CLI: `agent-skills compose` |
| Showcase | One-command shareable markdown for any skill | CLI: `agent-skills showcase` |
| Local Capabilities | Custom capabilities via `.agent-skills/capabilities/` with `extends` | SKILL_AUTHORING.md |
| Auth & RBAC | 4 hierarchical roles, API key + JWT, pluggable | AUTH.md |
| Webhooks | HMAC-signed event payloads with auto-retry | WEBHOOKS.md |
| Plugin System | Entry-point based auth, invoker, and binding-source plugins | PLUGINS.md |
| Audit Trail | Hash-chain audit with off/standard/full modes | OBSERVABILITY.md |
| CognitiveState v1 | Typed Frame/Working/Output/Trace aligned with CoALA | COGNITIVE_STATE_V1.md |
| JSON Schemas | 16 schemas (2020-12) for capabilities, skills, bindings | JSON_SCHEMAS.md |
| Governance Catalog | Skill lifecycle: draft → validated → trusted → recommended | SKILL_GOVERNANCE_MANIFESTO.md |
| Binding Conformance | strict/standard/experimental profiles per binding | CONSUMER_FACING_NEUTRAL_API.md |
| Binding Fallback | Deterministic fallback chain with terminal baseline | RUNNER_GUIDE.md |
### Skill Authoring

```bash
# LLM-powered wizard: generates a complete skill from a plain-language goal.
# Requires OPENAI_API_KEY (see .env.example).
export OPENAI_API_KEY=sk-...
agent-skills scaffold --wizard                  # LLM proposes workflow + YAML
agent-skills scaffold "Summarize a PDF"         # one line from intent
agent-skills scaffold --wizard --dry-run        # preview without saving
# Without OPENAI_API_KEY, the wizard falls back to manual interactive mode
# (you pick inputs, outputs, and capabilities yourself).

agent-skills test text.summarize                # auto-fixture tests
agent-skills describe text.summarize --mermaid  # DAG diagram
agent-skills export text.summarize              # portable bundle
agent-skills contribute text.summarize          # promotion pipeline
```

See docs/SKILL_AUTHORING.md for the full workflow guide.
## Documentation
| Topic | Link |
|---|---|
| 10-minute onboarding | ONBOARDING_10_MIN.md |
| Installation & setup | INSTALLATION.md |
| Environment variables | ENVIRONMENT_VARIABLES.md |
| Error taxonomy | ERROR_TAXONOMY.md |
| Runner architecture | RUNNER_GUIDE.md |
| DAG scheduler | SCHEDULER.md |
| Step control flow | STEP_CONTROL_FLOW.md |
| Streaming SSE | STREAMING.md |
| Async execution | ASYNC_EXECUTION.md |
| Deployment & Docker | DEPLOYMENT.md |
| Observability & OTel | OBSERVABILITY.md |
| Authentication & RBAC | AUTH.md |
| Security | SECURITY.md |
| OpenAPI foundation | OPENAPI_PHASE0_FOUNDATION.md |
| MCP integration | MCP_INTEGRATION_SLICES.md |
| Governance manifesto | SKILL_GOVERNANCE_MANIFESTO.md |
| Project status | PROJECT_STATUS.md |
Full documentation is served with MkDocs: `make serve` → http://localhost:8000
## Contributing

Contributions welcome! See CONTRIBUTING.md for guidelines.

```bash
make check   # lint + format + tests in one command
```
## Troubleshooting

| Problem | Solution |
|---|---|
| `RegistryNotFound` | Run `agent-skills doctor --fix` to auto-clone the registry |
| Skill returns unexpected error | See Error Taxonomy for frozen error codes |
| Environment config issues | Check Environment Variables |