Agent Skills Runtime
Agents should execute whenever possible.
A deterministic, binding-driven execution engine for composable AI agent skills.
Agent Skills Runtime lets you define agent capabilities as abstract contracts, wire them to any backend (Python, OpenAPI, MCP, OpenRPC), and execute multi-step workflows as declarative DAGs — with built-in safety gates, cognitive state tracking, and full observability.
No API keys required. 141 capabilities ship with deterministic Python baselines. Install and run your first skill in under 3 minutes.
⚡ 30-second start
Works on macOS, Linux, and Windows. No API key required.
```bash
# 1 — clone and install
git clone https://github.com/gfernandf/agent-skills.git
cd agent-skills
pip install -e .

# 2 — get the capability registry (clone it alongside agent-skills/)
git clone https://github.com/gfernandf/agent-skill-registry.git ../agent-skill-registry

# 3 — verify everything is wired
python skills.py doctor

# 4 — run your first skill (no API key needed)
# macOS / Linux / Git Bash:
python skills.py run text.summarize-plain-input \
  --input '{"text": "ORCA decouples agent reasoning from execution. Skills are DAGs, not prompt chains.", "max_length": 20}' \
  2>/dev/null
```

Windows PowerShell alternative for step 4:

```powershell
'{ "text": "ORCA decouples agent reasoning from execution.", "max_length": 20 }' | Set-Content input_qs.json -Encoding ascii
python skills.py run text.summarize-plain-input --input-file input_qs.json 2>$null
Remove-Item input_qs.json
```
Expected output:

```json
{"summary": "ORCA decouples agent reasoning from execution..."}
```
No API key. No server. Pure Python baseline, runs offline. → Full Quick Start · → Use with LangChain / OpenAI / Gemini / MCP
🧩 Mental Model
Think of Agent Skills as:
- APIs → turned into reusable “capabilities”
- Workflows (Zapier / Airflow) → turned into “skills”
- Agent reasoning → made explicit via structured state
In short:
Agent Skills lets agents execute structured workflows over tools, instead of guessing what to do via prompts.
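A skill, in this model, is just a small declarative DAG over capabilities. A hypothetical example (field names are illustrative only, not the runtime's actual schema; see the agent-skill-registry for real skill files):

```yaml
# Hypothetical skill definition -- field names are illustrative, not the real schema
id: text.summarize-and-translate
steps:
  - id: summarize
    capability: text.content.summarize
    inputs:
      text: "{{ inputs.text }}"
  - id: translate
    capability: text.content.translate
    depends_on: [summarize]
    inputs:
      text: "{{ steps.summarize.outputs.summary }}"
      target_language: es
outputs:
  result: "{{ steps.translate.outputs.text }}"
```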
Table of Contents
- Mental Model
- Introducing ORCA
- Why Agent Skills?
- When should you use Agent Skills?
- Architecture
- How it compares
- Quick Start
- Advanced Features
- Documentation
- Contributing
- License
- Citing
🧠 Introducing ORCA
Agent Skills Runtime is a reference implementation of ORCA — an emerging standard for structured agent execution.
ORCA (Open Cognitive Runtime Architecture) defines a Cognitive Execution Layer where agents do not act through prompts, but through composable, contract-driven processes.
Unlike traditional agent frameworks that rely on implicit reasoning inside LLMs, ORCA externalizes cognition into:
- Structured state (CognitiveState) — explicit, inspectable reasoning
- Capabilities (contracts) — reusable, binding-agnostic operations
- Skills (execution graphs) — deterministic, composable workflows
- Built-in safety — enforced execution constraints and validation
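The first bullet, externalized structured state, can be sketched as a plain typed container (illustrative only; the four sections mirror the text, but the field types and example values are assumptions, see COGNITIVE_STATE_V1.md for the real schema):

```python
from dataclasses import dataclass, field

@dataclass
class CognitiveState:
    """Sketch of an externalized, inspectable reasoning state."""
    frame: dict = field(default_factory=dict)    # task framing: goal, constraints
    working: dict = field(default_factory=dict)  # intermediate reasoning values
    output: dict = field(default_factory=dict)   # final, consumer-facing results
    trace: list = field(default_factory=list)    # ordered record of execution events

# Because state lives outside the LLM, every step of reasoning can be inspected:
state = CognitiveState(frame={"goal": "summarize the report"})
state.working["draft"] = "..."
state.trace.append({"step": "summarize", "status": "ok"})
```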
This shifts agent systems from prompt-driven behavior to execution-driven systems.
🔹 Core Principles of ORCA
- Execution over prompting
- Explicit state over implicit context
- Contracts over conventions
- Separation of intent and execution
- Safety as a first-class concern
🔹 Learn more
See the full ORCA specification:
👉 ORCA.md
Why Agent Skills?
| Problem | How Agent Skills solves it |
|---|---|
| Tools are coupled to one framework | Binding abstraction — same capability, 4 protocols (PythonCall, OpenAPI, MCP, OpenRPC) |
| Workflows are imperative code | Declarative YAML skills — steps, dependencies, mappings resolved by the runtime |
| No safety model | 4-tier safety gates — trust levels, confirmation prompts, scope constraints, side-effect tracking |
| No structured reasoning state | CognitiveState v1 — typed Frame/Working/Output/Trace aligned with CoALA |
| Inconsistent naming | Controlled vocabulary — 122 capabilities across 27 domains with governed naming |
| Hard to observe | OTel + metrics + audit — hash-chain audit trail, Prometheus metrics, SSE streaming |
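The "hash-chain audit" entry uses a standard tamper-evidence technique: each record's hash covers the previous record's hash, so editing any record invalidates everything after it. A minimal sketch of the idea (not the runtime's actual record format):

```python
import hashlib
import json

GENESIS = "0" * 64

def append_record(chain, event):
    """Append an event; its hash covers the previous record's hash."""
    prev = chain[-1]["hash"] if chain else GENESIS
    body = json.dumps(event, sort_keys=True)
    digest = hashlib.sha256((prev + body).encode()).hexdigest()
    chain.append({"event": event, "prev": prev, "hash": digest})

def verify(chain):
    """Recompute every link; any edited record breaks the chain."""
    prev = GENESIS
    for rec in chain:
        body = json.dumps(rec["event"], sort_keys=True)
        expected = hashlib.sha256((prev + body).encode()).hexdigest()
        if rec["prev"] != prev or rec["hash"] != expected:
            return False
        prev = rec["hash"]
    return True

chain = []
append_record(chain, {"step": "fetch", "status": "ok"})
append_record(chain, {"step": "summarize", "status": "ok"})
print(verify(chain))   # True
chain[0]["event"]["status"] = "tampered"
print(verify(chain))   # False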
🤔 When should you use Agent Skills?
Use it if you need:
- Deterministic and reproducible agent behavior
- Safe interaction with real systems (APIs, databases, etc.)
- Reusable workflows instead of prompt engineering
- Observability and auditability of agent execution
Avoid it if:
- You just need a quick prompt-based prototype
- You don’t need control over execution or safety
Architecture
Note: The diagram below uses Mermaid. It renders natively on GitHub. If viewing on PyPI or another platform, see the architecture diagram on GitHub.
```mermaid
graph TB
    subgraph "Developer Interface"
        CLI["CLI<br/>agent-skills run / describe / scaffold / test"]
        HTTP["HTTP API<br/>REST + SSE streaming"]
        SDK["SDKs<br/>Python · TypeScript · LangChain · CrewAI · AutoGen · SemanticKernel"]
        MCP_SERVER["MCP Server<br/>stdio + SSE transport"]
        LLM_NATIVE["Native LLM Adapters<br/>Anthropic · OpenAI · Gemini"]
    end
    subgraph "Gateway Layer"
        GW["Skill Gateway<br/>Discovery · Ranking · Governance"]
    end
    subgraph "Execution Engine"
        SCHED["DAG Scheduler<br/>Kahn's topological sort · parallel / sequential"]
        POLICY["Policy Engine<br/>Safety gates · Trust levels · Confirmation"]
        COGSTATE["CognitiveState v1<br/>Frame · Working · Output · Trace"]
    end
    subgraph "Binding Layer"
        BR["Binding Resolver<br/>Protocol routing · Fallback chain · Conformance"]
        PC["PythonCall"]
        OA["OpenAPI"]
        MCP_P["MCP"]
        RPC["OpenRPC"]
    end
    subgraph "Services"
        BASELINE["Python Baselines<br/>122 deterministic functions"]
        EXTERNAL["External APIs<br/>OpenAI · custom services"]
        MCP_SRV["MCP Servers<br/>In-process + subprocess"]
    end
    CLI --> GW
    HTTP --> GW
    SDK --> HTTP
    MCP_SERVER --> GW
    LLM_NATIVE --> GW
    GW --> SCHED
    SCHED --> POLICY
    POLICY --> BR
    SCHED -.-> COGSTATE
    BR --> PC --> BASELINE
    BR --> OA --> EXTERNAL
    BR --> MCP_P --> MCP_SRV
    BR --> RPC --> EXTERNAL
```
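The DAG Scheduler node above names Kahn's topological sort. The core idea, grouping zero-in-degree steps into waves that can run in parallel, can be sketched as follows (a simplified illustration with invented step names, not the runtime's actual scheduler):

```python
from collections import deque

def kahn_schedule(steps, deps):
    """Order workflow steps with Kahn's algorithm, grouped into waves.

    steps: iterable of step names
    deps:  dict mapping step -> list of steps it must wait for
    Returns a list of waves; steps within a wave share no dependencies
    and could run in parallel.
    """
    indegree = {s: len(deps.get(s, [])) for s in steps}
    dependents = {s: [] for s in steps}
    for step, requires in deps.items():
        for r in requires:
            dependents[r].append(step)

    wave = deque(s for s, d in indegree.items() if d == 0)
    order = []
    while wave:
        current = list(wave)
        wave.clear()
        order.append(current)
        for s in current:
            for t in dependents[s]:
                indegree[t] -= 1
                if indegree[t] == 0:
                    wave.append(t)
    if sum(len(w) for w in order) != len(indegree):
        raise ValueError("cycle detected in skill DAG")
    return order

waves = kahn_schedule(
    ["fetch", "clean", "summarize", "translate"],
    {"clean": ["fetch"], "summarize": ["clean"], "translate": ["clean"]},
)
print(waves)  # [['fetch'], ['clean'], ['summarize', 'translate']]
```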
How it compares
| Dimension | Agent Skills | LangGraph | SemanticKernel | OpenAI SDK | CrewAI |
|---|---|---|---|---|---|
| DAG Execution | ✅ Kahn sort | ✅ StateGraph | ⚠️ Linear | ⚠️ Tool-loop | ⚠️ Sequential |
| Multi-Protocol Bindings | ✅ 4 protocols | ❌ Python only | ⚠️ HTTP+plugins | ❌ Function only | ❌ Function only |
| Safety Model | ✅ 4-tier gates | ❌ None | ⚠️ Basic | ❌ Minimal | ❌ None |
| Cognitive State | ✅ Typed (CoALA) | ❌ No formal | ❌ No formal | ❌ No formal | ⚠️ Roles |
| Capability Registry | ✅ 122 governed | ❌ None | ⚠️ Plugin store | ❌ None | ⚠️ Templates |
| Observability | ✅ OTel+Metrics+Audit | ✅ LangSmith | ⚠️ AppInsights | ⚠️ Log-only | ⚠️ Basic |
| Zero-config local run | ✅ Python baselines | ⚠️ Needs LLM key | ⚠️ Needs Azure | ❌ Needs API key | ⚠️ Needs LLM key |
| Declarative workflows | ✅ YAML skills | ⚠️ Python code | ⚠️ C# code | ❌ Imperative | ⚠️ Python code |
| Checkpoint/Restore | ✅ Full state | ✅ Checkpoints | ✅ State | ❌ Stateless | ⚠️ Memory |
Quick Start
Install from PyPI
Note: The package is currently published on PyPI as `agent-skills`. The rename to `orca-agent-skills` takes effect with the next PyPI release. For `v1.0.0` features, use Install from source below.
```bash
pip install agent-skills        # core (current PyPI name)
pip install agent-skills[all]   # + PDF, web, OTel extras
pip install agent-skills[mcp]   # + MCP server/client
pip install agent-skills[dev]   # + pytest, ruff, benchmarks
```
The PyPI package includes the execution engine and CLI. You'll also need the companion agent-skill-registry (capability contracts, skills, vocabulary):
```bash
git clone https://github.com/gfernandf/agent-skill-registry.git
python skills.py doctor   # verifies registry is found
```
Windows note: After `pip install`, if `agent-skills` is not found in your shell, add Python's `Scripts/` directory to your PATH, or use `python skills.py` from the repo root as a drop-in replacement for all `agent-skills` commands throughout this README.
Install from source
macOS / Linux:

```bash
git clone https://github.com/gfernandf/agent-skills.git
cd agent-skills
make bootstrap            # clones registry alongside, installs deps
python skills.py doctor
```

Windows (PowerShell):

```powershell
git clone https://github.com/gfernandf/agent-skills.git
cd agent-skills
pip install -e ".[all,dev]"
git clone https://github.com/gfernandf/agent-skill-registry.git ../agent-skill-registry
python skills.py doctor
```
What `make bootstrap` does: clones the registry into `../agent-skill-registry/`, then runs `pip install -e ".[all,dev]"`. If you prefer manual setup, see docs/INSTALLATION.md.
Run your first skill
```bash
# macOS / Linux / Git Bash (stderr suppressed for clean output):
python skills.py run text.summarize-plain-input \
  --input '{"text": "Agent Skills Runtime is a deterministic execution engine for composable AI agent skills. It supports four binding protocols and ships with 141 Python baselines.", "max_length": 50}' \
  2>/dev/null
```

Windows PowerShell:

```powershell
'{ "text": "Agent Skills Runtime is a deterministic execution engine.", "max_length": 50 }' | Set-Content input_run.json -Encoding ascii
python skills.py run text.summarize-plain-input --input-file input_run.json 2>$null
Remove-Item input_run.json
```
Expected output:

```json
{
  "summary": "Agent Skills Runtime is a deterministic execution engine for composable AI agent skills."
}
```
Tip: Trace/telemetry events are written to stderr. Redirect with `2>/dev/null` (bash) or `2>$null` (PowerShell) for clean output. Use `--audit-mode off` to disable audit records entirely.
Run via HTTP
```bash
agent-skills serve                     # starts server on :8080
curl http://localhost:8080/v1/health   # health check
curl -X POST http://localhost:8080/v1/skills/text.summarize-plain-input/execute \
  -H "Content-Type: application/json" \
  -d '{"inputs": {"text": "Hello world", "max_length": 20}}'
```
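The same call from Python, using only the standard library. The endpoint path and payload shape are taken from the curl example above; the helper function itself is illustrative, not part of the SDK:

```python
import json
from urllib import request

def build_execute_request(skill_id, inputs, host="http://localhost:8080"):
    """Build the same POST the curl example sends."""
    return request.Request(
        f"{host}/v1/skills/{skill_id}/execute",
        data=json.dumps({"inputs": inputs}).encode(),
        headers={"Content-Type": "application/json"},
    )

req = build_execute_request("text.summarize-plain-input",
                            {"text": "Hello world", "max_length": 20})

# Send it once `agent-skills serve` is running:
# with request.urlopen(req) as resp:
#     print(json.load(resp))
```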
Baseline → LLM: same skill, two modes
Every capability ships with a deterministic Python baseline. Set OPENAI_API_KEY to upgrade to LLM-powered execution — zero code changes.
```bash
# 1. Baseline mode (no API key, pure Python)
python skills.py run text.summarize-plain-input \
  --input '{"text": "Agent Skills decouples capability contracts from execution backends.", "max_length": 30}' \
  2>/dev/null
# → {"summary": "Agent Skills decouples capability contracts from exec..."}

# 2. LLM mode (set key, same command — zero code changes)
export OPENAI_API_KEY=sk-...         # bash
# $env:OPENAI_API_KEY = "sk-..."     # PowerShell
python skills.py run text.summarize-plain-input \
  --input '{"text": "Agent Skills decouples capability contracts from execution backends.", "max_length": 30}' \
  2>/dev/null
# → {"summary": "Agent Skills separates capability definitions from their runtime implementations."}
```
The binding resolver picks the best available backend automatically:
- No key → `PythonCall` baseline (deterministic, offline, fast)
- Key set → `OpenAPI` binding to OpenAI (richer output, higher latency)
This means your CI stays green without API keys, and production gets LLM quality — from the same skill YAML.
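The selection logic described above amounts to walking an ordered chain and taking the first available backend. A sketch of that idea (protocol names come from this README; the chain ordering and availability checks are assumptions, not the resolver's real implementation):

```python
def resolve_binding(capability, env):
    """Pick the first available backend from an ordered fallback chain."""
    chain = [
        # Remote LLM backend, only if a key is configured
        ("OpenAPI", lambda: bool(env.get("OPENAI_API_KEY"))),
        # Deterministic Python baseline: terminal fallback, always available
        ("PythonCall", lambda: True),
    ]
    for protocol, available in chain:
        if available():
            return protocol
    raise RuntimeError(f"no binding available for {capability}")

print(resolve_binding("text.summarize-plain-input", env={}))
# → PythonCall
print(resolve_binding("text.summarize-plain-input", env={"OPENAI_API_KEY": "sk-test"}))
# → OpenAPI
```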
See docs/INSTALLATION.md for full setup instructions, optional extras, and environment variable reference.
Use with LangChain / LangGraph
```python
from sdk.embedded import as_langchain_tools

# No server needed — runs in-process
tools = as_langchain_tools(["text.content.summarize", "text.content.translate"])

# Pass tools to any LangChain AgentExecutor or LangGraph node
agent = create_react_agent(llm, tools)
```
Adapters are also available for CrewAI, AutoGen, and Semantic Kernel — see the sdk/ directory.
Use as MCP Server
Expose all 122 capabilities as MCP tools over stdio (or SSE) — any MCP-compatible host (Claude Desktop, VS Code Copilot, etc.) can discover and call them:
```bash
# stdio transport (default — for Claude Desktop / MCP hosts)
python -m official_mcp_servers

# SSE transport (for network clients)
python -m official_mcp_servers --sse --host 0.0.0.0 --port 8765

# Or via the CLI
agent-skills mcp-serve
agent-skills mcp-serve --sse --port 8765
```

Requires the `mcp` extra: `pip install -e ".[mcp]"`
Native LLM Tool Definitions
Generate provider-native tool arrays for Anthropic, OpenAI, and Gemini — no HTTP server, no adapters, just the format each SDK expects:
```python
from sdk.embedded import (
    as_anthropic_tools, execute_anthropic_tool_call,
    as_openai_tools, execute_openai_tool_call,
    as_gemini_tools, execute_gemini_tool_call,
)

# ── Anthropic ──────────────────────────────────
tools = as_anthropic_tools(["text.content.summarize"])
response = client.messages.create(model="claude-sonnet-4-20250514", tools=tools, ...)
result = execute_anthropic_tool_call(block.name, block.input)

# ── OpenAI ─────────────────────────────────────
tools = as_openai_tools()  # all 122 capabilities
response = openai.chat.completions.create(model="gpt-4o", tools=tools, ...)
result = execute_openai_tool_call(call.function.name, call.function.arguments)

# ── Gemini ─────────────────────────────────────
tools = as_gemini_tools(["data.schema.validate"])
response = model.generate_content(contents, tools=tools)
result = execute_gemini_tool_call(fc.name, fc.args)
```
Each `execute_*` helper maps the underscore tool name back to the dotted capability ID and returns a JSON string.
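One simple way to implement that reverse mapping is a lookup table built from the registered capability IDs; a naive `str.replace` would be lossy because dotted IDs may also contain hyphens. The capability IDs below come from this README, but the mapping code is a sketch, not the SDK's implementation:

```python
# Capability IDs taken from examples in this README
CAPABILITIES = ["text.content.summarize", "text.summarize-plain-input"]

# Flatten both "." and "-" to "_" for the provider-facing tool name,
# and remember which capability each flattened name came from.
_BY_TOOL_NAME = {
    cap.replace(".", "_").replace("-", "_"): cap for cap in CAPABILITIES
}

def tool_name_to_capability(tool_name):
    """Map an underscore tool name back to its dotted capability ID."""
    return _BY_TOOL_NAME[tool_name]

print(tool_name_to_capability("text_summarize_plain_input"))
# → text.summarize-plain-input
```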
Choosing your integration mode
| Mode | Best for | Latency | Requires server? |
|---|---|---|---|
| Embedded SDK (`sdk.embedded`) | Python apps, notebooks, scripts | Lowest (in-process) | No |
| Native LLM tools (`as_anthropic_tools`, etc.) | Direct Anthropic/OpenAI/Gemini integration | Low (in-process) | No |
| LangChain / CrewAI / AutoGen (`as_langchain_tools`, etc.) | Framework-based agents | Low (in-process) | No |
| MCP Server (`python -m official_mcp_servers`) | Claude Desktop, VS Code Copilot, MCP hosts | Low (stdio/SSE) | MCP host |
| HTTP REST (`agent-skills serve`) | Microservices, non-Python clients, multi-tenant | Medium (network) | Yes |
Start here: If you're writing Python, use the embedded SDK — zero setup, lowest latency. Switch to HTTP only when you need network access or non-Python clients.
License
Apache 2.0 — see LICENSE.
📄 Research Paper
Beyond Prompting: Decoupling Cognition from Execution in LLM-based Agents through the ORCA Framework
Fernandez Alvarez, G. E. (2026) · DOI: 10.5281/zenodo.19438943
The theoretical foundations of ORCA and this runtime are described in our research paper. See the full paper landing page for abstract, downloads, and citation formats.
📥 Download PDF · 📖 ORCA Specification
Citing
If you use Agent Skills or ORCA in your research, please cite the paper:
@article{fernandez_orca_2026,
author = {Fernandez Alvarez, Guillermo E.},
title = {Beyond Prompting: Decoupling Cognition from Execution in
LLM-based Agents through the ORCA Framework},
year = {2026},
publisher = {Zenodo},
doi = {10.5281/zenodo.19438943},
url = {https://doi.org/10.5281/zenodo.19438943}
}
To cite the software specifically:
@software{fernandez_agent_skills_2026,
author = {Fernandez Alvarez, Guillermo},
title = {Agent Skills Runtime},
year = {2026},
url = {https://github.com/gfernandf/agent-skills},
version = {1.0.0},
license = {Apache-2.0}
}
GitHub also provides a "Cite this repository" button powered by CITATION.cff.
Advanced Features
| Feature | Description | Docs |
|---|---|---|
| NL Autopilot | `agent-skills ask "summarize this"` — discovers, maps, executes | SKILL_AUTHORING.md |
| Dev Watch | Hot-reload skill development with `agent-skills dev` | SKILL_AUTHORING.md |
| Skill Triggers | Declarative webhook / event / file-change triggers | WEBHOOKS.md |
| Benchmark Lab | Compare binding protocols side-by-side | CLI: `agent-skills benchmark-lab` |
| Compose DSL | Compact `.compose` text syntax for workflows | CLI: `agent-skills compose` |
| Showcase | One-command shareable markdown for any skill | CLI: `agent-skills showcase` |
| Local Capabilities | Custom capabilities via `.agent-skills/capabilities/` with `extends` | SKILL_AUTHORING.md |
| Auth & RBAC | 4 hierarchical roles, API key + JWT, pluggable | AUTH.md |
| Webhooks | HMAC-signed event payloads with auto-retry | WEBHOOKS.md |
| Plugin System | Entry-point based auth, invoker, and binding-source plugins | PLUGINS.md |
| Audit Trail | Hash-chain audit with off/standard/full modes | OBSERVABILITY.md |
| CognitiveState v1 | Typed Frame/Working/Output/Trace aligned with CoALA | COGNITIVE_STATE_V1.md |
| JSON Schemas | 16 schemas (2020-12) for capabilities, skills, bindings | JSON_SCHEMAS.md |
| Governance Catalog | Skill lifecycle: draft → validated → trusted → recommended | SKILL_GOVERNANCE_MANIFESTO.md |
| Binding Conformance | strict/standard/experimental profiles per binding | CONSUMER_FACING_NEUTRAL_API.md |
| Binding Fallback | Deterministic fallback chain with terminal baseline | RUNNER_GUIDE.md |
Skill Authoring
```bash
# LLM-powered wizard — generates a complete skill from a plain-language goal
# Requires OPENAI_API_KEY (see .env.example)
export OPENAI_API_KEY=sk-...
agent-skills scaffold --wizard              # LLM proposes workflow + YAML
agent-skills scaffold "Summarize a PDF"     # one-line from intent
agent-skills scaffold --wizard --dry-run    # preview without saving

# Without OPENAI_API_KEY, the wizard falls back to manual interactive mode
# (you pick inputs, outputs, and capabilities yourself)

agent-skills test text.summarize                 # auto-fixture tests
agent-skills describe text.summarize --mermaid   # DAG diagram
agent-skills export text.summarize               # portable bundle
agent-skills contribute text.summarize           # promotion pipeline
```
See docs/SKILL_AUTHORING.md for the full workflow guide.
Documentation
| Topic | Link |
|---|---|
| 10-minute onboarding | ONBOARDING_10_MIN.md |
| Installation & setup | INSTALLATION.md |
| Environment variables | ENVIRONMENT_VARIABLES.md |
| Error taxonomy | ERROR_TAXONOMY.md |
| Runner architecture | RUNNER_GUIDE.md |
| DAG scheduler | SCHEDULER.md |
| Step control flow | STEP_CONTROL_FLOW.md |
| Streaming SSE | STREAMING.md |
| Async execution | ASYNC_EXECUTION.md |
| Deployment & Docker | DEPLOYMENT.md |
| Observability & OTel | OBSERVABILITY.md |
| Authentication & RBAC | AUTH.md |
| Security | SECURITY.md |
| OpenAPI foundation | OPENAPI_PHASE0_FOUNDATION.md |
| MCP integration | MCP_INTEGRATION_SLICES.md |
| Governance manifesto | SKILL_GOVERNANCE_MANIFESTO.md |
| Project status | PROJECT_STATUS.md |
Full documentation is served with MkDocs: `make serve` → http://localhost:8000
Contributing
Contributions welcome! See CONTRIBUTING.md for guidelines.
```bash
make check   # lint + format + tests in one command
```
Troubleshooting
| Problem | Solution |
|---|---|
| `RegistryNotFound` | Run `agent-skills doctor --fix` to auto-clone the registry |
| Skill returns unexpected error | See Error Taxonomy for frozen error codes |
| Environment config issues | Check Environment Variables |