
Runtime for executing reusable AI agent skills and capability bindings


Agent Skills Runtime

Agents should execute whenever possible.


A deterministic, binding-driven execution engine for composable AI agent skills.

Agent Skills Runtime lets you define agent capabilities as abstract contracts, wire them to any backend (Python, OpenAPI, MCP, OpenRPC), and execute multi-step workflows as declarative DAGs, with built-in safety gates, cognitive state tracking, and full observability.

No API keys required. 122 capabilities ship with deterministic Python baselines. Install and run your first skill in under 3 minutes.


🧩 Mental Model

Think of Agent Skills as:

  • APIs → turned into reusable "capabilities"
  • Workflows (Zapier / Airflow) → turned into "skills"
  • Agent reasoning → made explicit via structured state

In short:

Agent Skills lets agents execute structured workflows over tools, instead of guessing what to do via prompts.




🧠 Introducing ORCA

Agent Skills Runtime is a reference implementation of ORCA, an emerging standard for structured agent execution.

ORCA (Open Cognitive Runtime Architecture) defines a Cognitive Execution Layer where agents do not act through prompts, but through composable, contract-driven processes.

Unlike traditional agent frameworks that rely on implicit reasoning inside LLMs, ORCA externalizes cognition into:

  • Structured state (CognitiveState): explicit, inspectable reasoning
  • Capabilities (contracts): reusable, binding-agnostic operations
  • Skills (execution graphs): deterministic, composable workflows
  • Built-in safety: enforced execution constraints and validation
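The structured-state component can be pictured as a small typed container. Here is a hypothetical sketch: the class and field names are illustrative assumptions, not the actual CognitiveState v1 schema (see COGNITIVE_STATE_V1.md for that):

```python
from dataclasses import dataclass, field

# Hypothetical sketch of the Frame/Working/Output/Trace split -- the
# real CognitiveState v1 schema is defined in COGNITIVE_STATE_V1.md.
@dataclass
class Frame:
    goal: str                                      # fixed task framing
    constraints: list = field(default_factory=list)

@dataclass
class CognitiveState:
    frame: Frame                                   # what the agent is doing
    working: dict = field(default_factory=dict)    # mutable scratch space
    output: dict = field(default_factory=dict)     # final structured result
    trace: list = field(default_factory=list)      # ordered step log

state = CognitiveState(frame=Frame(goal="summarize document"))
state.working["draft"] = "..."
state.trace.append({"step": "text.summarize", "status": "ok"})
```

Because the state is plain data rather than hidden prompt context, every reasoning step can be inspected, logged, and replayed.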

This shifts agent systems from prompt-driven behavior to execution-driven systems.

🔹 Core Principles of ORCA

  • Execution over prompting
  • Explicit state over implicit context
  • Contracts over conventions
  • Separation of intent and execution
  • Safety as a first-class concern

🔹 Learn more

See the full ORCA specification:
👉 ORCA.md


Why Agent Skills?

| Problem | How Agent Skills solves it |
|---|---|
| Tools are coupled to one framework | Binding abstraction: same capability, 4 protocols (PythonCall, OpenAPI, MCP, OpenRPC) |
| Workflows are imperative code | Declarative YAML skills: steps, dependencies, mappings resolved by the runtime |
| No safety model | 4-tier safety gates: trust levels, confirmation prompts, scope constraints, side-effect tracking |
| No structured reasoning state | CognitiveState v1: typed Frame/Working/Output/Trace aligned with CoALA |
| Inconsistent naming | Controlled vocabulary: 122 capabilities across 27 domains with governed naming |
| Hard to observe | OTel + metrics + audit: hash-chain audit trail, Prometheus metrics, SSE streaming |
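To make the "declarative YAML skills" row concrete, here is a hypothetical skill definition. Every field name and the templating syntax are illustrative assumptions, not the runtime's actual schema (see docs/SKILL_AUTHORING.md for the real one):

```yaml
# Hypothetical skill sketch -- field names and templating are illustrative.
skill: text.summarize-and-translate
steps:
  - id: summarize
    capability: text.content.summarize
    inputs:
      text: "{{ inputs.text }}"
  - id: translate
    capability: text.content.translate
    depends_on: [summarize]           # runtime resolves this into a DAG edge
    inputs:
      text: "{{ steps.summarize.outputs.summary }}"
      target_language: "es"
outputs:
  translated_summary: "{{ steps.translate.outputs.text }}"
```

The runtime, not your code, resolves step order, dependency edges, and input/output mappings from a file like this.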

🤔 When should you use Agent Skills?

Use it if you need:

  • Deterministic and reproducible agent behavior
  • Safe interaction with real systems (APIs, databases, etc.)
  • Reusable workflows instead of prompt engineering
  • Observability and auditability of agent execution

Avoid it if:

  • You just need a quick prompt-based prototype
  • You don't need control over execution or safety
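The tiered safety gates mentioned earlier can be pictured as a precedence check plus a confirmation hook. This is a minimal sketch: the tier names, their ordering, and the confirmation callback are invented for illustration and do not reflect the runtime's actual policy engine:

```python
# Illustrative 4-tier gate -- tier names and confirmation flow are
# assumptions for this sketch, not the runtime's real policy API.
TRUST_TIERS = ["read_only", "standard", "elevated", "destructive"]

def gate(step_tier: str, granted_tier: str, confirm=lambda msg: False) -> bool:
    """Allow a step if its tier is within the granted trust level,
    requiring explicit confirmation for destructive operations."""
    if TRUST_TIERS.index(step_tier) > TRUST_TIERS.index(granted_tier):
        return False                  # step exceeds granted trust: block
    if step_tier == "destructive":
        return confirm(f"Step requires {step_tier} access. Proceed?")
    return True
```

The point is that safety is evaluated per step by the runtime, before any binding is invoked, rather than being left to the LLM's judgment.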

Architecture

Note: The diagram below uses Mermaid. It renders natively on GitHub. If viewing on PyPI or another platform, see the architecture diagram on GitHub.

graph TB
    subgraph "Developer Interface"
        CLI["CLI<br/>agent-skills run / describe / scaffold / test"]
        HTTP["HTTP API<br/>REST + SSE streaming"]
        SDK["SDKs<br/>Python · TypeScript · LangChain · CrewAI · AutoGen · SemanticKernel"]
        MCP_SERVER["MCP Server<br/>stdio + SSE transport"]
        LLM_NATIVE["Native LLM Adapters<br/>Anthropic · OpenAI · Gemini"]
    end

    subgraph "Gateway Layer"
        GW["Skill Gateway<br/>Discovery · Ranking · Governance"]
    end

    subgraph "Execution Engine"
        SCHED["DAG Scheduler<br/>Kahn's topological sort · parallel / sequential"]
        POLICY["Policy Engine<br/>Safety gates · Trust levels · Confirmation"]
        COGSTATE["CognitiveState v1<br/>Frame · Working · Output · Trace"]
    end

    subgraph "Binding Layer"
        BR["Binding Resolver<br/>Protocol routing · Fallback chain · Conformance"]
        PC["PythonCall"]
        OA["OpenAPI"]
        MCP_P["MCP"]
        RPC["OpenRPC"]
    end

    subgraph "Services"
        BASELINE["Python Baselines<br/>122 deterministic functions"]
        EXTERNAL["External APIs<br/>OpenAI · custom services"]
        MCP_SRV["MCP Servers<br/>In-process + subprocess"]
    end

    CLI --> GW
    HTTP --> GW
    SDK --> HTTP
    MCP_SERVER --> GW
    LLM_NATIVE --> GW
    GW --> SCHED
    SCHED --> POLICY
    POLICY --> BR
    SCHED -.-> COGSTATE
    BR --> PC --> BASELINE
    BR --> OA --> EXTERNAL
    BR --> MCP_P --> MCP_SRV
    BR --> RPC --> EXTERNAL
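The DAG Scheduler's step ordering (Kahn's topological sort, per the diagram) can be sketched in a few lines. This standalone version is illustrative, not the runtime's implementation:

```python
from collections import deque

# Kahn's topological sort over step dependencies: repeatedly emit steps
# with no unmet dependencies. Steps in `ready` at the same time could run
# in parallel. (Sketch only -- not the runtime's actual scheduler.)
def topo_order(deps: dict[str, set[str]]) -> list[str]:
    indegree = {step: len(d) for step, d in deps.items()}
    dependents: dict[str, list[str]] = {s: [] for s in deps}
    for step, d in deps.items():
        for dep in d:
            dependents[dep].append(step)
    ready = deque(s for s, n in indegree.items() if n == 0)
    order: list[str] = []
    while ready:
        step = ready.popleft()
        order.append(step)
        for nxt in dependents[step]:
            indegree[nxt] -= 1
            if indegree[nxt] == 0:
                ready.append(nxt)
    if len(order) != len(deps):
        raise ValueError("cycle detected in skill DAG")
    return order
```

Because ordering is computed up front from declared dependencies, execution is deterministic and cycles are rejected before any step runs.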

How it compares

| Dimension | Agent Skills | LangGraph | SemanticKernel | OpenAI SDK | CrewAI |
|---|---|---|---|---|---|
| DAG Execution | ✅ Kahn sort | ✅ StateGraph | ⚠️ Linear | ⚠️ Tool-loop | ⚠️ Sequential |
| Multi-Protocol Bindings | ✅ 4 protocols | ❌ Python only | ⚠️ HTTP+plugins | ❌ Function only | ❌ Function only |
| Safety Model | ✅ 4-tier gates | ❌ None | ⚠️ Basic | ❌ Minimal | ❌ None |
| Cognitive State | ✅ Typed (CoALA) | ❌ No formal | ❌ No formal | ❌ No formal | ⚠️ Roles |
| Capability Registry | ✅ 122 governed | ❌ None | ⚠️ Plugin store | ❌ None | ⚠️ Templates |
| Observability | ✅ OTel+Metrics+Audit | ✅ LangSmith | ⚠️ AppInsights | ⚠️ Log-only | ⚠️ Basic |
| Zero-config local run | ✅ Python baselines | ⚠️ Needs LLM key | ⚠️ Needs Azure | ❌ Needs API key | ⚠️ Needs LLM key |
| Declarative workflows | ✅ YAML skills | ⚠️ Python code | ⚠️ C# code | ❌ Imperative | ⚠️ Python code |
| Checkpoint/Restore | ✅ Full state | ✅ Checkpoints | ✅ State | ❌ Stateless | ⚠️ Memory |

Quick Start

Install from PyPI

pip install orca-agent-skills            # core
pip install "orca-agent-skills[all]"     # + PDF, web, OTel extras
pip install "orca-agent-skills[mcp]"     # + MCP server/client
pip install "orca-agent-skills[dev]"     # + pytest, ruff, benchmarks

The PyPI package includes the execution engine and CLI. You'll also need the companion agent-skill-registry (capability contracts, skills, vocabulary):

git clone https://github.com/gfernandf/agent-skill-registry.git
agent-skills doctor   # verifies registry is found

Install from source

git clone https://github.com/gfernandf/agent-skills.git
cd agent-skills
make bootstrap       # clones registry alongside, installs deps
agent-skills doctor   # all checks should pass

What make bootstrap does: clones the registry into ../agent-skill-registry/, then runs pip install -e ".[all,dev]". If you prefer manual setup, see docs/INSTALLATION.md.

Run your first skill

Agent Skills CLI demo: doctor + summarize

agent-skills run text.summarize-plain-input \
  --input '{"text": "Agent Skills Runtime is a deterministic execution engine for composable AI agent skills. It supports four binding protocols and ships with 122 Python baselines.", "max_length": 50}'

Expected output:

{
  "summary": "Agent Skills Runtime is a deterministic execution engine...",
  "sentiment": "positive"
}

Run via HTTP

agent-skills serve                     # starts server on :8080
curl http://localhost:8080/v1/health   # health check
curl -X POST http://localhost:8080/v1/skills/text.summarize-plain-input/execute \
  -H "Content-Type: application/json" \
  -d '{"inputs": {"text": "Hello world", "max_length": 20}}'

Baseline → LLM: same skill, two modes

Every capability ships with a deterministic Python baseline. Set OPENAI_API_KEY to upgrade to LLM-powered execution with zero code changes.

# 1. Baseline mode (no API key, pure Python)
agent-skills run text.summarize-plain-input \
  --input '{"text": "Agent Skills decouples capability contracts from execution backends.", "max_length": 30}'
# → {"summary": "Agent Skills decouples capability contracts from exec..."}

# 2. LLM mode (set key, same command)
export OPENAI_API_KEY=sk-...
agent-skills run text.summarize-plain-input \
  --input '{"text": "Agent Skills decouples capability contracts from execution backends.", "max_length": 30}'
# → {"summary": "Agent Skills separates capability definitions from their runtime implementations."}

The binding resolver picks the best available backend automatically:

  • No key → PythonCall baseline (deterministic, offline, fast)
  • Key set → OpenAPI binding to OpenAI (richer output, higher latency)

This means your CI stays green without API keys, and production gets LLM quality, all from the same skill YAML.
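The selection described above amounts to a simple precedence check. A minimal sketch (the real resolver also covers MCP and OpenRPC bindings, fallback chains, and conformance profiles):

```python
import os

# Sketch of key-based backend selection: prefer the LLM-backed binding
# when a key is present, otherwise fall back to the deterministic
# baseline. (Illustrative -- not the runtime's actual resolver.)
def pick_binding(capability: str) -> str:
    if os.environ.get("OPENAI_API_KEY"):
        return "openapi"        # LLM-backed: richer output, higher latency
    return "python_call"        # baseline: deterministic, offline, fast
```

The terminal fallback is always the Python baseline, which is why a run can never fail just because no key is configured.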

See docs/INSTALLATION.md for full setup instructions, optional extras, and environment variable reference.

Use with LangChain / LangGraph

from langgraph.prebuilt import create_react_agent  # for the LangGraph example below
from sdk.embedded import as_langchain_tools

# No server needed: runs in-process
tools = as_langchain_tools(["text.content.summarize", "text.content.translate"])
# Pass tools to any LangChain AgentExecutor or LangGraph node
agent = create_react_agent(llm, tools)

Adapters are also available for CrewAI, AutoGen, and Semantic Kernel; see the sdk/ directory.

Use as MCP Server

Expose all 122 capabilities as MCP tools over stdio (or SSE); any MCP-compatible host (Claude Desktop, VS Code Copilot, etc.) can discover and call them:

# stdio transport (default, for Claude Desktop / MCP hosts)
python -m official_mcp_servers

# SSE transport (for network clients)
python -m official_mcp_servers --sse --host 0.0.0.0 --port 8765

# Or via the CLI
agent-skills mcp-serve
agent-skills mcp-serve --sse --port 8765

Requires the mcp extra: pip install -e ".[mcp]"

Native LLM Tool Definitions

Generate provider-native tool arrays for Anthropic, OpenAI, and Gemini: no HTTP server, no adapters, just the format each SDK expects:

from sdk.embedded import (
    as_anthropic_tools, execute_anthropic_tool_call,
    as_openai_tools,    execute_openai_tool_call,
    as_gemini_tools,    execute_gemini_tool_call,
)

# ── Anthropic ──────────────────────────────────
tools = as_anthropic_tools(["text.content.summarize"])
response = client.messages.create(model="claude-sonnet-4-20250514", tools=tools, ...)
result = execute_anthropic_tool_call(block.name, block.input)

# ── OpenAI ─────────────────────────────────────
tools = as_openai_tools()  # all 122 capabilities
response = openai.chat.completions.create(model="gpt-4o", tools=tools, ...)
result = execute_openai_tool_call(call.function.name, call.function.arguments)

# ── Gemini ─────────────────────────────────────
tools = as_gemini_tools(["data.schema.validate"])
response = model.generate_content(contents, tools=tools)
result = execute_gemini_tool_call(fc.name, fc.args)

Each execute_* helper maps the underscore tool name back to the dotted capability ID and returns a JSON string.
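That mapping can be sketched as follows: provider tool schemas disallow dots in tool names, so dots become underscores on export and are reversed on execution. The naive round trip shown here is an illustration, not the SDK's actual logic (which would also need to handle capability IDs that themselves contain underscores):

```python
# Illustrative dotted-ID <-> underscore-name mapping.
def to_tool_name(capability_id: str) -> str:
    # "text.content.summarize" -> "text_content_summarize"
    return capability_id.replace(".", "_")

def to_capability_id(tool_name: str) -> str:
    # "text_content_summarize" -> "text.content.summarize"
    return tool_name.replace("_", ".")
```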

Choosing your integration mode

| Mode | Best for | Latency | Requires server? |
|---|---|---|---|
| Embedded SDK (sdk.embedded) | Python apps, notebooks, scripts | Lowest (in-process) | No |
| Native LLM tools (as_anthropic_tools, etc.) | Direct Anthropic/OpenAI/Gemini integration | Low (in-process) | No |
| LangChain / CrewAI / AutoGen (as_langchain_tools, etc.) | Framework-based agents | Low (in-process) | No |
| MCP Server (python -m official_mcp_servers) | Claude Desktop, VS Code Copilot, MCP hosts | Low (stdio/SSE) | MCP host |
| HTTP REST (agent-skills serve) | Microservices, non-Python clients, multi-tenant | Medium (network) | Yes |

Start here: if you're writing Python, use the embedded SDK for zero setup and the lowest latency. Switch to HTTP only when you need network access or non-Python clients.

License

Apache 2.0 (see LICENSE).

Citing

If you use Agent Skills in your research, please cite:

@software{fernandez_agent_skills_2026,
  author       = {Fernandez Alvarez, Guillermo},
  title        = {Agent Skills Runtime},
  year         = {2026},
  url          = {https://github.com/gfernandf/agent-skills},
  version      = {0.1.0},
  license      = {Apache-2.0}
}

GitHub also provides a "Cite this repository" button powered by CITATION.cff.


Advanced Features

| Feature | Description | Docs |
|---|---|---|
| NL Autopilot | agent-skills ask "summarize this": discovers, maps, executes | SKILL_AUTHORING.md |
| Dev Watch | Hot-reload skill development with agent-skills dev | SKILL_AUTHORING.md |
| Skill Triggers | Declarative webhook / event / file-change triggers | WEBHOOKS.md |
| Benchmark Lab | Compare binding protocols side-by-side | CLI: agent-skills benchmark-lab |
| Compose DSL | Compact .compose text syntax for workflows | CLI: agent-skills compose |
| Showcase | One-command shareable markdown for any skill | CLI: agent-skills showcase |
| Local Capabilities | Custom capabilities via .agent-skills/capabilities/ with extends | SKILL_AUTHORING.md |
| Auth & RBAC | 4 hierarchical roles, API key + JWT, pluggable | AUTH.md |
| Webhooks | HMAC-signed event payloads with auto-retry | WEBHOOKS.md |
| Plugin System | Entry-point based auth, invoker, and binding-source plugins | PLUGINS.md |
| Audit Trail | Hash-chain audit with off/standard/full modes | OBSERVABILITY.md |
| CognitiveState v1 | Typed Frame/Working/Output/Trace aligned with CoALA | COGNITIVE_STATE_V1.md |
| JSON Schemas | 16 schemas (2020-12) for capabilities, skills, bindings | JSON_SCHEMAS.md |
| Governance Catalog | Skill lifecycle: draft → validated → trusted → recommended | SKILL_GOVERNANCE_MANIFESTO.md |
| Binding Conformance | strict/standard/experimental profiles per binding | CONSUMER_FACING_NEUTRAL_API.md |
| Binding Fallback | Deterministic fallback chain with terminal baseline | RUNNER_GUIDE.md |
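The hash-chain audit idea can be sketched simply: each record's digest covers its payload plus the previous record's digest, so editing any entry invalidates everything after it. This is a concept sketch, not the runtime's actual audit record format:

```python
import hashlib
import json

# Concept sketch of a hash-chain audit log (not the runtime's format):
# each record links to its predecessor via the previous digest.
def append_record(chain: list[dict], payload: dict) -> None:
    prev = chain[-1]["hash"] if chain else "0" * 64
    body = json.dumps(payload, sort_keys=True)
    digest = hashlib.sha256((prev + body).encode()).hexdigest()
    chain.append({"payload": payload, "prev": prev, "hash": digest})

def verify(chain: list[dict]) -> bool:
    prev = "0" * 64
    for rec in chain:
        body = json.dumps(rec["payload"], sort_keys=True)
        expected = hashlib.sha256((prev + body).encode()).hexdigest()
        if rec["prev"] != prev or rec["hash"] != expected:
            return False               # chain broken: record was tampered
        prev = rec["hash"]
    return True
```

Verification only needs the log itself, which is what makes the trail tamper-evident without a trusted database.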

Skill Authoring

# LLM-powered wizard: generates a complete skill from a plain-language goal
# Requires OPENAI_API_KEY (see .env.example)
export OPENAI_API_KEY=sk-...
agent-skills scaffold --wizard               # LLM proposes workflow + YAML
agent-skills scaffold "Summarize a PDF"      # one-line from intent
agent-skills scaffold --wizard --dry-run     # preview without saving

# Without OPENAI_API_KEY, the wizard falls back to manual interactive mode
# (you pick inputs, outputs, and capabilities yourself)

agent-skills test text.summarize              # auto-fixture tests
agent-skills describe text.summarize --mermaid  # DAG diagram
agent-skills export text.summarize            # portable bundle
agent-skills contribute text.summarize        # promotion pipeline

See docs/SKILL_AUTHORING.md for the full workflow guide.

Documentation

| Topic | Link |
|---|---|
| 10-minute onboarding | ONBOARDING_10_MIN.md |
| Installation & setup | INSTALLATION.md |
| Environment variables | ENVIRONMENT_VARIABLES.md |
| Error taxonomy | ERROR_TAXONOMY.md |
| Runner architecture | RUNNER_GUIDE.md |
| DAG scheduler | SCHEDULER.md |
| Step control flow | STEP_CONTROL_FLOW.md |
| Streaming SSE | STREAMING.md |
| Async execution | ASYNC_EXECUTION.md |
| Deployment & Docker | DEPLOYMENT.md |
| Observability & OTel | OBSERVABILITY.md |
| Authentication & RBAC | AUTH.md |
| Security | SECURITY.md |
| OpenAPI foundation | OPENAPI_PHASE0_FOUNDATION.md |
| MCP integration | MCP_INTEGRATION_SLICES.md |
| Governance manifesto | SKILL_GOVERNANCE_MANIFESTO.md |
| Project status | PROJECT_STATUS.md |

Full documentation is served with MkDocs: make serve → http://localhost:8000

Contributing

Contributions welcome! See CONTRIBUTING.md for guidelines.

make check   # lint + format + tests in one command

Troubleshooting

| Problem | Solution |
|---|---|
| RegistryNotFound | Run agent-skills doctor --fix to auto-clone the registry |
| Skill returns unexpected error | See Error Taxonomy for frozen error codes |
| Environment config issues | Check Environment Variables |
