cave-agent

CaveAgent is a tool-augmented agent framework that enables function-calling through LLM code generation and provides runtime state management. Unlike traditional JSON-schema approaches, it natively handles complex Python objects like DataFrames and ndarrays within a persistent runtime, enabling lossless data flow across multi-turn interactions.

CaveAgent: Transforming LLMs into Stateful Runtime Operators

"From text-in-text-out to (text&object)-in-(text&object)-out"

Most LLM agents operate under a text-in-text-out paradigm, with tool interactions constrained to JSON primitives. CaveAgent breaks this with Stateful Runtime Management—a persistent Python runtime with direct variable injection and retrieval:

Inject any Python object into the runtime—DataFrames, models, database connections, custom class instances—as first-class variables the LLM can manipulate
Persist state across turns without serialization; objects live in the runtime, not in the context window
Retrieve manipulated objects back as native Python types for downstream use
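
The inject, persist, and retrieve cycle can be illustrated without CaveAgent itself. What follows is a minimal sketch, assuming a runtime is essentially one long-lived namespace that generated code executes against. `MiniRuntime` and its methods are hypothetical stand-ins, not CaveAgent's API.

```python
class MiniRuntime:
    """Hypothetical stand-in for a persistent runtime: one namespace shared by all turns."""

    def __init__(self):
        self._namespace = {}

    def inject(self, name, value):
        # The object itself is stored, not a serialized copy.
        self._namespace[name] = value

    def execute(self, code):
        # In an agent, `code` would be generated by the LLM.
        exec(code, self._namespace)

    def retrieve(self, name):
        return self._namespace[name]


runtime = MiniRuntime()
readings = [3, 1, 2]
runtime.inject("readings", readings)

runtime.execute("readings.sort()")        # turn 1: mutate the injected object
runtime.execute("total = sum(readings)")  # turn 2: state from turn 1 is still there

print(runtime.retrieve("readings") is readings)  # True: same object, no copy
print(runtime.retrieve("total"))                 # 6
```

Because the list is never converted to text and back, the caller gets the same object it injected, with whatever changes the executed code made to it.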

Installation
Hello World
Examples
Agent Skills
Features
Configuration
LLM Provider Support

Installation

pip install 'cave-agent[all]'

Choose your installation:

# OpenAI support
pip install 'cave-agent[openai]'

# 100+ LLM providers via LiteLLM
pip install 'cave-agent[litellm]'

Hello World

import asyncio
from cave_agent import CaveAgent
from cave_agent.runtime import PythonRuntime, Variable, Function
from cave_agent.models import LiteLLMModel

model = LiteLLMModel(model_id="model-id", api_key="your-api-key", custom_llm_provider="openai")

async def main():
    def reverse(s: str) -> str:
        """Reverse a string"""
        return s[::-1]

    runtime = PythonRuntime(
        variables=[
            Variable("secret", "!dlrow ,olleH", "A reversed message"),
            Variable("greeting", "", "Store the reversed message"),
        ],
        functions=[Function(reverse)],
    )
    agent = CaveAgent(model, runtime=runtime)
    response = await agent.run("Reverse the secret")
    print(runtime.retrieve("greeting"))  # Hello, world!
    print(response.content)              # Agent's text response

asyncio.run(main())

Examples

Function Calling

# Inject functions and variables into runtime
runtime = PythonRuntime(
    variables=[Variable("tasks", [], "User's task list")],
    functions=[Function(add_task), Function(complete_task)],
)
agent = CaveAgent(model, runtime=runtime)

await agent.run("Add 'buy groceries' to my tasks")
print(runtime.retrieve("tasks"))  # [{'name': 'buy groceries', 'done': False}]

See examples/basic_usage.py for a complete example.
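
The snippet above leaves `add_task` and `complete_task` undefined. A minimal sketch of what such helpers might look like follows; the signatures (taking the task list as the first argument) are an assumption made for illustration and may differ from the ones in examples/basic_usage.py.

```python
def add_task(tasks: list, name: str) -> list:
    """Append a new, not-yet-done task to the list and return the list."""
    tasks.append({"name": name, "done": False})
    return tasks


def complete_task(tasks: list, name: str) -> bool:
    """Mark the first task with a matching name as done; return True if one was found."""
    for task in tasks:
        if task["name"] == name:
            task["done"] = True
            return True
    return False


tasks = []
add_task(tasks, "buy groceries")
print(tasks)  # [{'name': 'buy groceries', 'done': False}]
complete_task(tasks, "buy groceries")
print(tasks)  # [{'name': 'buy groceries', 'done': True}]
```

Type hints and docstrings matter here because, in a code-generating agent, they are typically what the model sees when deciding how to call a function.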

Stateful Object Interactions

# Inject objects with methods - LLM can call them directly
runtime = PythonRuntime(
    types=[Type(Light), Type(Thermostat)],
    variables=[
        Variable("light", Light("Living Room"), "Smart light"),
        Variable("thermostat", Thermostat(), "Home thermostat"),
    ],
)
agent = CaveAgent(model, runtime=runtime)

await agent.run("Dim the light to 20% and set thermostat to 22°C")
light = runtime.retrieve("light")  # Object with updated state

See examples/object_methods.py for a complete example.
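
The `Light` and `Thermostat` classes are not shown in the snippet above. The sketch below gives one plausible shape for them; the method names `dim` and `set_temperature` and the default values are hypothetical, chosen only to match the constructor calls and the prompt in the example.

```python
from dataclasses import dataclass


@dataclass
class Light:
    """Hypothetical smart light; brightness is a percentage from 0 to 100."""
    room: str
    brightness: int = 100

    def dim(self, percent: int) -> None:
        if not 0 <= percent <= 100:
            raise ValueError("percent must be between 0 and 100")
        self.brightness = percent


@dataclass
class Thermostat:
    """Hypothetical thermostat; temperature is in degrees Celsius."""
    temperature: float = 20.0

    def set_temperature(self, celsius: float) -> None:
        self.temperature = celsius


light = Light("Living Room")
thermostat = Thermostat()

# These two calls are what generated code would run against the injected objects.
light.dim(20)
thermostat.set_temperature(22)
print(light.brightness, thermostat.temperature)  # 20 22
```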

Multi-Agent Coordination

# Sub-agents with their own runtimes
cleaner_agent = CaveAgent(model, runtime=PythonRuntime(variables=[
    Variable("data", [], "Input"), Variable("cleaned_data", [], "Output"),
]))

analyzer_agent = CaveAgent(model, runtime=PythonRuntime(variables=[
    Variable("data", [], "Input"), Variable("insights", {}, "Output"),
]))

# Orchestrator controls sub-agents as first-class objects
orchestrator = CaveAgent(model, runtime=PythonRuntime(variables=[
    Variable("raw_data", raw_data, "Raw dataset"),
    Variable("cleaner", cleaner_agent, "Cleaner agent"),
    Variable("analyzer", analyzer_agent, "Analyzer agent"),
]))

# Inject → trigger → retrieve
await orchestrator.run("Clean raw_data using cleaner, then analyze using analyzer")
insights = analyzer_agent.runtime.retrieve("insights")

See examples/multi_agent.py for a complete example.

Real-time Streaming

async for event in agent.stream_events("Analyze this data"):
    if event.type.value == 'code':
        print(f"Executing: {event.content}")
    elif event.type.value == 'execution_output':
        print(f"Result: {event.content}")

See examples/stream.py for a complete example.
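
To show the consumption pattern in isolation, here is a self-contained sketch that replaces the agent with a plain async generator. `EventType`, `Event`, and `fake_stream_events` are hypothetical stand-ins; only the two type strings, `code` and `execution_output`, come from the snippet above.

```python
import asyncio
from dataclasses import dataclass
from enum import Enum


class EventType(Enum):
    # Values mirror the strings compared against in the snippet above.
    CODE = "code"
    EXECUTION_OUTPUT = "execution_output"


@dataclass
class Event:
    type: EventType
    content: str


async def fake_stream_events(prompt: str):
    """Hypothetical stand-in for agent.stream_events(): yields events as they occur."""
    yield Event(EventType.CODE, "result = 1 + 1")
    await asyncio.sleep(0)  # hand control back to the event loop between events
    yield Event(EventType.EXECUTION_OUTPUT, "2")


async def collect(prompt: str) -> list:
    lines = []
    async for event in fake_stream_events(prompt):
        if event.type is EventType.CODE:
            lines.append(f"Executing: {event.content}")
        elif event.type is EventType.EXECUTION_OUTPUT:
            lines.append(f"Result: {event.content}")
    return lines


lines = asyncio.run(collect("Analyze this data"))
print(lines)  # ['Executing: result = 1 + 1', 'Result: 2']
```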

Security Rules

# Block dangerous operations with AST-based validation
rules = [
    ImportRule({"os", "subprocess", "sys"}),
    FunctionRule({"eval", "exec", "open"}),
    AttributeRule({"__globals__", "__builtins__"}),
]
runtime = PythonRuntime(security_checker=SecurityChecker(rules))
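
For readers unfamiliar with AST-based validation, the sketch below shows the general technique using only the standard library `ast` module. It is not CaveAgent's implementation: `find_violations` and the three constant sets are hypothetical names, and the checks cover only the cases listed in the rules above.

```python
import ast

BLOCKED_IMPORTS = {"os", "subprocess", "sys"}
BLOCKED_CALLS = {"eval", "exec", "open"}
BLOCKED_ATTRIBUTES = {"__globals__", "__builtins__"}


def find_violations(source: str) -> list:
    """Return a human-readable list of rule violations found in `source`."""
    violations = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            for alias in node.names:
                # Compare only the top-level package, so "os.path" matches "os".
                if alias.name.split(".")[0] in BLOCKED_IMPORTS:
                    violations.append(f"import of {alias.name}")
        elif isinstance(node, ast.ImportFrom):
            if node.module and node.module.split(".")[0] in BLOCKED_IMPORTS:
                violations.append(f"import from {node.module}")
        elif isinstance(node, ast.Call):
            # Only direct calls by name are caught, e.g. eval(...).
            if isinstance(node.func, ast.Name) and node.func.id in BLOCKED_CALLS:
                violations.append(f"call to {node.func.id}()")
        elif isinstance(node, ast.Attribute):
            if node.attr in BLOCKED_ATTRIBUTES:
                violations.append(f"access to .{node.attr}")
    return violations


print(find_violations("import os\nx = eval('1')"))
print(find_violations("total = sum([1, 2, 3])"))  # []
```

The code is parsed but never executed during the check, so a violation is reported before anything runs. A static check of this kind is a filter rather than a sandbox: indirect access, such as looking a name up through a string, is not caught by these simple rules.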

More Examples

Basic Usage: Function calling and object processing
Runtime State: State management across interactions
Object Methods: Class methods and complex objects
Multi-Turn: Conversations with state persistence
Multi-Agent: Data pipeline with multiple agents
Stream: Streaming responses and events

Agent Skills

CaveAgent implements the Agent Skills open standard—a portable format for packaging instructions that agents can discover and use. Originally developed by Anthropic and now supported across the AI ecosystem (Claude, Gemini CLI, Cursor, VS Code, and more), Skills enable agents to acquire domain expertise on-demand.

Creating a Skill

A Skill is a directory containing a SKILL.md file with YAML frontmatter:

my-skill/
├── SKILL.md           # Required: Skill definition and instructions
└── injection.py       # Optional: Functions/variables/types to inject (CaveAgent extension)

SKILL.md structure:

---
name: data-processor
description: Process and analyze datasets with statistical methods. Use when working with data analysis tasks.
---

# Data Processing Instructions

## Quick Start
Use the injected functions to analyze datasets...

## Workflows
1. Activate the skill to load injected functions
2. Apply statistical analysis using the provided functions
3. Return structured results

Required fields: name (max 64 chars, lowercase with hyphens) and description (max 1024 chars)

Optional fields: license, compatibility, metadata
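
The field constraints above are easy to check mechanically. The following is a minimal sketch, not CaveAgent's loader: `parse_frontmatter` and `validate` are hypothetical helpers, the parser handles only flat `key: value` lines rather than full YAML, and allowing digits in `name` is an assumption.

```python
import re

# Lowercase words separated by single hyphens; digits are allowed by assumption.
NAME_PATTERN = re.compile(r"^[a-z0-9]+(-[a-z0-9]+)*$")


def parse_frontmatter(text: str) -> dict:
    """Parse simple `key: value` frontmatter delimited by `---` lines."""
    lines = text.strip().splitlines()
    if not lines or lines[0].strip() != "---":
        raise ValueError("missing opening --- line")
    fields = {}
    for line in lines[1:]:
        if line.strip() == "---":
            return fields
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    raise ValueError("missing closing --- line")


def validate(fields: dict) -> list:
    """Return a list of problems; an empty list means the metadata passes."""
    problems = []
    name = fields.get("name", "")
    description = fields.get("description", "")
    if not name or len(name) > 64 or not NAME_PATTERN.match(name):
        problems.append("name must be 1-64 chars, lowercase with hyphens")
    if not description or len(description) > 1024:
        problems.append("description must be 1-1024 chars")
    return problems


skill_md = """---
name: data-processor
description: Process and analyze datasets with statistical methods.
---

# Data Processing Instructions
"""
fields = parse_frontmatter(skill_md)
print(fields["name"])    # data-processor
print(validate(fields))  # []
```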

How Skills Load (Progressive Disclosure)

Skills use progressive disclosure to minimize context usage:

Level	When Loaded	Content
Metadata	At startup	`name` and `description` from YAML frontmatter (~100 tokens)
Instructions	When activated	SKILL.md body with guidance (loaded on-demand)
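
The two levels in the table can be sketched as a lazily loaded object: the metadata is held from the start, while the body is read from disk only on first activation. `LazySkill` is a hypothetical class written for illustration and does not reflect how CaveAgent stores skills internally.

```python
import tempfile
from pathlib import Path


class LazySkill:
    """Hypothetical skill handle: metadata is held eagerly, the body is read on activation."""

    def __init__(self, path: Path, name: str, description: str):
        self.path = path
        self.name = name                # small: available from startup
        self.description = description  # small: available from startup
        self._body = None               # larger: loaded only when needed

    @property
    def loaded(self) -> bool:
        return self._body is not None

    def activate(self) -> str:
        if self._body is None:
            text = self.path.read_text(encoding="utf-8")
            # Everything after the closing frontmatter delimiter is the body.
            self._body = text.split("---", 2)[2].strip()
        return self._body


with tempfile.TemporaryDirectory() as tmp:
    skill_file = Path(tmp) / "SKILL.md"
    skill_file.write_text(
        "---\nname: demo\ndescription: Demo skill\n---\n# Instructions\nDo the thing.\n",
        encoding="utf-8",
    )
    skill = LazySkill(skill_file, "demo", "Demo skill")
    before = skill.loaded
    body = skill.activate()
    after = skill.loaded

print(before, after)  # False True
print(body)
```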

Using Skills

from cave_agent import CaveAgent, Skill
from cave_agent.skills import SkillDiscovery
from cave_agent.runtime import Function, Variable

# Create skills directly
skill = Skill(
    name="my-skill",
    description="A custom skill",
    body_content="# Instructions\nFollow these steps...",
    functions=[Function(my_func)],
    variables=[Variable("config", value={})],
)
agent = CaveAgent(model=model, skills=[skill])

# Or load from files
skill = SkillDiscovery.from_file("./my-skill/SKILL.md")
agent = CaveAgent(model=model, skills=[skill])

# Or load from directory
skills = SkillDiscovery.from_directory("./skills")
agent = CaveAgent(model=model, skills=skills)

When skills are loaded, the agent gains access to the activate_skill(skill_name) runtime function to activate a skill and load its instructions.

Injection Module (CaveAgent Extension)

CaveAgent extends the Agent Skills standard with injection.py—allowing skills to inject functions, variables, and types directly into the runtime when activated:

from cave_agent.runtime import Function, Variable, Type
from dataclasses import dataclass

def analyze_data(data: list) -> dict:
    """Analyze data and return statistics."""
    return {"mean": sum(data) / len(data), "count": len(data)}

@dataclass
class AnalysisResult:
    mean: float
    count: int
    status: str

CONFIG = {"threshold": 0.5, "max_items": 1000}

__exports__ = [
    Function(analyze_data, description="Analyze data statistically"),
    Variable("CONFIG", value=CONFIG, description="Analysis configuration"),
    Type(AnalysisResult, description="Result structure"),
]

When activate_skill() is called, these exports are automatically injected into the runtime namespace.

See examples/skill_data_processor.py for a complete example.
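
Conceptually, injecting exports amounts to binding each exported item to a name in the runtime namespace. The sketch below illustrates that idea with hypothetical stand-ins (`FunctionExport`, `VariableExport`, `inject_exports`); it does not use or reproduce CaveAgent's own `Function`, `Variable`, or activation logic.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class FunctionExport:
    """Hypothetical stand-in for a Function(...) export."""
    func: Callable


@dataclass
class VariableExport:
    """Hypothetical stand-in for a Variable(...) export."""
    name: str
    value: Any


def inject_exports(namespace: dict, exports: list) -> None:
    """Bind each export to a name in the shared namespace."""
    for export in exports:
        if isinstance(export, FunctionExport):
            namespace[export.func.__name__] = export.func
        elif isinstance(export, VariableExport):
            namespace[export.name] = export.value
        else:
            raise TypeError(f"unsupported export: {export!r}")


def analyze_data(data: list) -> dict:
    """Analyze data and return statistics."""
    return {"mean": sum(data) / len(data), "count": len(data)}


exports = [
    FunctionExport(analyze_data),
    VariableExport("CONFIG", {"threshold": 0.5, "max_items": 1000}),
]

namespace = {}
inject_exports(namespace, exports)

# After injection, executed code can use the new names directly.
exec("result = analyze_data([1, 2, 3])", namespace)
print(namespace["result"])               # {'mean': 2.0, 'count': 3}
print(namespace["CONFIG"]["threshold"])  # 0.5
```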

Features

Code-Based Function Calling: Leverages the LLM's natural coding abilities instead of rigid JSON schemas
Secure Runtime Environment:
- Inject Python objects, variables, and functions as tools
- Rule-based security validation prevents dangerous code execution
- Flexible security rules: ImportRule, FunctionRule, AttributeRule, RegexRule
- Customizable security policies for different use cases
- Access execution results and maintain state across interactions
Agent Skills: Implements the open Agent Skills standard for modular, portable instruction packages. CaveAgent extends the standard with runtime injection (injection.py).
Multi-Agent Coordination: Control sub-agents programmatically through runtime injection and retrieval. Shared runtimes enable instant state synchronization.
Streaming & Async: Real-time event streaming and full async/await support for optimal performance
Execution Control: Configurable step limits and error handling to prevent infinite loops
Flexible LLM Support: Works with any LLM provider via OpenAI-compatible APIs or LiteLLM
Type Injection: Expose class schemas for type-aware LLM code generation

Awesome Blogs

We thank the community members who have written about our work.

Configuration

Parameter	Type	Default	Description
model	Model	required	LLM model instance (OpenAIServerModel or LiteLLMModel)
runtime	PythonRuntime	None	Python runtime with variables, functions, and types
skills	List[Skill]	None	List of skill objects to load
max_steps	int	5	Maximum execution steps per run
max_history	int	10	Maximum conversation history length
max_exec_output	int	5000	Max characters in execution output
instructions	str	default	User instructions defining agent role and behavior
system_instructions	str	default	System-level execution rules and examples
system_prompt_template	str	default	Custom system prompt template
python_block_identifier	str	python	Code block language identifier
messages	List[Message]	None	Initial message history
log_level	LogLevel	DEBUG	Logging verbosity level
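
Two of these limits, `max_steps` and `max_exec_output`, are simple guards, and the sketch below shows one way such guards could behave. `truncate_output` and `run_steps` are hypothetical helpers written for illustration; the exact truncation format and loop semantics in CaveAgent may differ.

```python
def truncate_output(output: str, max_exec_output: int = 5000) -> str:
    """Keep at most `max_exec_output` characters and note how many were dropped."""
    if len(output) <= max_exec_output:
        return output
    dropped = len(output) - max_exec_output
    return output[:max_exec_output] + f"\n[... {dropped} characters truncated]"


def run_steps(step, max_steps: int = 5) -> int:
    """Call `step(i)` until it returns True or the step budget is spent.

    Returns the number of steps actually taken.
    """
    for i in range(max_steps):
        if step(i):
            return i + 1
    return max_steps


print(truncate_output("short"))            # short
print(run_steps(lambda i: False))          # 5: budget exhausted
print(run_steps(lambda i: i == 1))         # 2: finished early
```

A step budget keeps a run from looping indefinitely when the model never produces a final answer, and an output cap keeps a single large print from crowding out the rest of the context.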

LLM Provider Support

CaveAgent supports multiple LLM providers:

OpenAI-Compatible Models

from cave_agent.models import OpenAIServerModel

model = OpenAIServerModel(
    model_id="gpt-4",
    api_key="your-api-key",
    base_url="https://api.openai.com/v1"  # or your custom endpoint
)

LiteLLM Models (Recommended)

LiteLLM provides unified access to hundreds of LLM providers:

from cave_agent.models import LiteLLMModel

# OpenAI
model = LiteLLMModel(
    model_id="gpt-4",
    api_key="your-api-key",
    custom_llm_provider='openai'
)

# Anthropic Claude
model = LiteLLMModel(
    model_id="claude-3-sonnet-20240229",
    api_key="your-api-key",
    custom_llm_provider='anthropic'
)

# Google Gemini
model = LiteLLMModel(
    model_id="gemini/gemini-pro",
    api_key="your-api-key"
)

Contributing

Contributions are welcome! Please feel free to submit a PR. For more details, see CONTRIBUTING.md.

Citation

If you use CaveAgent in your research, please cite:

@article{ran2026caveagent,
  title={CaveAgent: Transforming LLMs into Stateful Runtime Operators},
  author={Ran, Maohao and Wan, Zhenglin and Lin, Cooper and Zhang, Yanting and others},
  journal={arXiv preprint arXiv:2601.01569},
  year={2026}
}

License

MIT License

Project details

Release history

Version	Release date
0.7.3	Apr 8, 2026
0.7.2	Apr 7, 2026
0.7.1	Mar 27, 2026
0.7.0	Mar 18, 2026
0.6.5 (this version)	Jan 26, 2026
0.6.4	Jan 20, 2026
0.6.3	Jan 19, 2026
0.6.2	Jan 8, 2026
0.6.1	Dec 12, 2025
0.6.0	Dec 3, 2025
0.5.0	Oct 24, 2025

Download files

Source Distribution

cave_agent-0.6.5.tar.gz (740.7 kB), uploaded Jan 26, 2026

Built Distribution

cave_agent-0.6.5-py3-none-any.whl (37.8 kB), uploaded Jan 26, 2026, Python 3

File details

Details for the file cave_agent-0.6.5.tar.gz.

File metadata

Download URL: cave_agent-0.6.5.tar.gz
Upload date: Jan 26, 2026
Size: 740.7 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.13.5

File hashes

Hashes for cave_agent-0.6.5.tar.gz
Algorithm	Hash digest
SHA256	`ee321988616d9915909806228b1a7f75f8644fd9f5fb87452f6473270d34eeae`
MD5	`d358339c5b22f5250ae13f5b4f821a39`
BLAKE2b-256	`8fd131922d3528ced8d15fccbc0bce7d87a7ff8fda470569015b2000632a8d57`


File details

Details for the file cave_agent-0.6.5-py3-none-any.whl.

File metadata

Download URL: cave_agent-0.6.5-py3-none-any.whl
Upload date: Jan 26, 2026
Size: 37.8 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.13.5

File hashes

Hashes for cave_agent-0.6.5-py3-none-any.whl
Algorithm	Hash digest
SHA256	`35244e5e32a3a85a5b143fff4f1eef7cec4e0ff23c5766f0190fd1257a60869a`
MD5	`eab83cec7437f17658d010805f599031`
BLAKE2b-256	`2c4bf43e4bc1e37a402cdc61520998fb1c19e1084f7ee16f389981422c5d4906`
