A harness-engineered AI agent runtime with tool isolation, self-correction, and permission scoping

agentu

A harness-engineered AI agent runtime. Build agents with tool isolation, self-correction, and permission scoping out of the box.

pip install agentu

Quick start

import asyncio

from agentu import Agent

def search_products(query: str) -> list:
    # db is a placeholder for your own data layer
    return db.products.search(query)

agent = Agent("sales").with_tools([search_products])

async def main():
    # Call a tool directly
    result = await agent.call("search_products", {"query": "laptop"})

    # Or let the LLM figure it out
    result = await agent.infer("Find me laptops under $1500")

asyncio.run(main())

call() runs a named tool directly with the parameters you pass. infer() lets the LLM pick the tool and fill in the parameters from natural language.
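To make the difference concrete, here is a minimal sketch of a tool registry. It is not agentu's implementation: `ToolRegistry`, `register`, and `describe` are hypothetical names, and the sketch assumes tools are plain functions whose signatures can be read with `inspect`. A direct call dispatches by name with explicit parameters, while the description is the kind of metadata a model would need in order to pick a tool and fill in its arguments.

```python
import inspect
from typing import Any, Callable, Dict


class ToolRegistry:
    """Hypothetical registry that maps tool names to plain Python functions."""

    def __init__(self) -> None:
        self._tools: Dict[str, Callable[..., Any]] = {}

    def register(self, func: Callable[..., Any]) -> None:
        self._tools[func.__name__] = func

    def call(self, name: str, params: Dict[str, Any]) -> Any:
        # Direct execution: the caller names the tool and supplies every argument.
        return self._tools[name](**params)

    def describe(self) -> Dict[str, Dict[str, str]]:
        # Parameter names and type names a model could use to choose a tool.
        described = {}
        for name, func in self._tools.items():
            signature = inspect.signature(func)
            described[name] = {
                param: getattr(p.annotation, "__name__", str(p.annotation))
                for param, p in signature.parameters.items()
            }
        return described


def search_products(query: str) -> list:
    catalog = ["laptop pro", "laptop air", "desk lamp"]  # stand-in data
    return [item for item in catalog if query in item]


registry = ToolRegistry()
registry.register(search_products)
```

In this sketch, `registry.call("search_products", {"query": "laptop"})` plays the role of call(), and `registry.describe()` is what an LLM-routed path would hand to the model.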

Sandboxed tool execution

Tools run in isolated subprocesses with timeouts and permission scoping. Separate what the agent can read from what it can write:

agent = Agent("assistant").with_sandbox(
    read_tools=[search, get_weather],
    write_tools=[save_file, send_email],
    timeout=10,
)

result = await agent.infer("Find the weather and save it to a file")

read_tools get READONLY permission: no side effects.
write_tools get WRITE permission: side effects allowed.
Every tool runs in a subprocess, not in your agent's process.
If a tool hangs past the timeout, its subprocess is killed and the agent stays alive.
Sandbox exit codes, stderr, and timeouts are captured in the observer.
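The subprocess-plus-timeout behavior described above can be illustrated with the standard library alone. This is a sketch under assumptions, not agentu's sandbox: `run_isolated` is a hypothetical helper, the "tool" is a Python snippet passed to a child interpreter, and the timeout is in seconds because that is what `subprocess.run` expects.

```python
import json
import subprocess
import sys
from typing import Any, Dict


def run_isolated(code: str, timeout: float) -> Dict[str, Any]:
    """Run a snippet in a child Python process and report how it ended."""
    try:
        completed = subprocess.run(
            [sys.executable, "-c", code],
            capture_output=True,
            text=True,
            timeout=timeout,
        )
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child before raising, so the parent survives.
        return {"timed_out": True, "exit_code": None, "stdout": "", "stderr": ""}
    return {
        "timed_out": False,
        "exit_code": completed.returncode,
        "stdout": completed.stdout,
        "stderr": completed.stderr,
    }


# A well-behaved tool and one that hangs past its timeout.
ok = run_isolated("import json; print(json.dumps({'temp_c': 21}))", timeout=10)
hung = run_isolated("import time; time.sleep(30)", timeout=1)
```

The returned dictionary carries the same kind of information the list above mentions (exit code, stderr, timeout flag), which is what makes post-mortem debugging possible after a tool misbehaves.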

Guardrails with self-correction

When output guardrails fail, the agent retries automatically by feeding the violation back to the LLM:

agent = Agent("assistant").with_guardrails(
    output_guardrails=[NoPII(), NoHallucination()],
    max_corrections=2,
)

result = await agent.infer("Summarize the customer data")
# If the LLM leaks PII, it retries up to 2 times with the violation as feedback
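The retry-with-feedback loop can be sketched without any LLM at all. Everything below is illustrative: `generate_with_corrections`, `no_email`, and `fake_llm` are hypothetical names, the guardrail is a simple email regex rather than agentu's NoPII, and the stand-in model only behaves once it sees the violation in its prompt.

```python
import re
from typing import Callable, List, Optional

# A guardrail returns None when the text passes, or a violation message.
Guardrail = Callable[[str], Optional[str]]


def no_email(text: str) -> Optional[str]:
    if re.search(r"[\w.+-]+@[\w-]+\.[\w.]+", text):
        return "Output contains an email address; remove it."
    return None


def generate_with_corrections(
    llm: Callable[[str], str],
    prompt: str,
    guardrails: List[Guardrail],
    max_corrections: int = 2,
) -> str:
    attempt_prompt = prompt
    # One initial attempt plus up to max_corrections retries.
    for _ in range(max_corrections + 1):
        output = llm(attempt_prompt)
        violations = [msg for msg in (check(output) for check in guardrails) if msg]
        if not violations:
            return output
        # Feed the violations back so the next attempt can correct them.
        feedback = "; ".join(violations)
        attempt_prompt = f"{prompt}\n\nPrevious answer was rejected: {feedback}"
    raise ValueError(f"Guardrails still failing after {max_corrections} corrections")


def fake_llm(prompt: str) -> str:
    # Stand-in model: leaks an address until it is told about the violation.
    if "rejected" in prompt:
        return "Three customers renewed this month."
    return "Contact jane@example.com about her renewal."


result = generate_with_corrections(fake_llm, "Summarize the customer data", [no_email])
```

Raising after the last failed correction is one possible policy; returning the last output with a warning would be another, and the source does not say which one agentu uses.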

Rule files

Prepend project-level rules to every LLM call:

agent = Agent("assistant").with_rules("AGENTS.md")

The contents of AGENTS.md get prepended to the system prompt. Works with declarative config too:

name: "support-agent"
model: "openai/gpt-4o"
rules: "AGENTS.md"

Tool permissions

Three permission levels control what tools can do:

from agentu import Agent, Tool, ToolPermission

agent = Agent("bot").with_tools([
    Tool(search, permission=ToolPermission.READONLY),     # always allowed
    Tool(save_file, permission=ToolPermission.WRITE),     # allowed, logged
    Tool(delete_all, permission=ToolPermission.DANGEROUS), # blocked by default
])

# Explicitly allow DANGEROUS tools
agent.with_permissions(allow_dangerous=True)
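The three levels map naturally onto a small gate in front of every tool call. The following is a sketch of that idea, not agentu's code: `Permission` and `PermissionGate` are hypothetical names, and the "logged" behavior for WRITE tools is modeled as an in-memory list.

```python
from enum import Enum
from typing import Any, Callable, List


class Permission(Enum):
    READONLY = "readonly"
    WRITE = "write"
    DANGEROUS = "dangerous"


class PermissionGate:
    """READONLY runs, WRITE runs and is logged, DANGEROUS needs an explicit opt-in."""

    def __init__(self, allow_dangerous: bool = False) -> None:
        self.allow_dangerous = allow_dangerous
        self.audit_log: List[str] = []

    def run(self, func: Callable[..., Any], permission: Permission, **kwargs: Any) -> Any:
        if permission is Permission.DANGEROUS and not self.allow_dangerous:
            raise PermissionError(f"{func.__name__} is DANGEROUS and blocked by default")
        if permission is not Permission.READONLY:
            # Anything with side effects leaves a trace.
            self.audit_log.append(func.__name__)
        return func(**kwargs)


def search(query: str) -> str:
    return f"results for {query}"


def delete_all() -> str:
    return "deleted"


gate = PermissionGate()
```

With the default gate, `gate.run(delete_all, Permission.DANGEROUS)` raises `PermissionError`; constructing the gate with `allow_dangerous=True` lets the same call through.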

Declarative configuration

Deploy agents from YAML or JSON with zero code:

1. Create a bot.yaml (or .json)

name: "support-agent"
model: "openai/gpt-4o"
system_prompt: "You are an expert IT agent."
rules: "AGENTS.md"
notify:
  - "discord://webhook/id"
cache:
  preset: "distributed"

2. Load dynamically

from agentu import Agent
import asyncio

async def main():
    agent = await Agent.from_config("bot.yaml")
    
    # Append local Python rules/tools if desired, then infer!
    agent.with_tools([resolve_ticket])
    await agent.infer("Help me reset my router")

asyncio.run(main())

Loading .yaml files requires pip install agentu[yaml]. JSON loads natively without extra dependencies.

Workflows

Chain agents with >> (sequential) and & (parallel):

# One after another
workflow = researcher("Find AI trends") >> analyst("Analyze") >> writer("Summarize")

# All at once
workflow = search("AI") & search("ML") & search("Crypto")

# Parallel first, then merge
workflow = (search("AI") & search("ML")) >> analyst("Compare findings")

result = await workflow.run()
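For readers curious how >> and & can compose steps, here is a minimal sketch built on Python's operator overloading and `asyncio.gather`. It is an illustration under assumptions, not agentu's workflow engine: `Step`, `Parallel`, `search`, and `compare` are hypothetical, and each step simply receives the previous step's output.

```python
import asyncio
from typing import Any, Awaitable, Callable, List


class Step:
    """Hypothetical workflow node wrapping an async function of the previous output."""

    def __init__(self, func: Callable[[Any], Awaitable[Any]]) -> None:
        self._func = func

    async def run(self, previous: Any = None) -> Any:
        return await self._func(previous)

    def __rshift__(self, other: "Step") -> "Step":
        first = self

        # a >> b: run a, then pass its output to b.
        async def sequential(previous: Any) -> Any:
            return await other.run(await first.run(previous))

        return Step(sequential)

    def __and__(self, other: "Step") -> "Parallel":
        # a & b & c: flatten into one parallel group instead of nesting.
        left = self.steps if isinstance(self, Parallel) else [self]
        right = other.steps if isinstance(other, Parallel) else [other]
        return Parallel(left + right)


class Parallel(Step):
    def __init__(self, steps: List[Step]) -> None:
        self.steps = steps

    async def run(self, previous: Any = None) -> List[Any]:
        # gather runs the steps concurrently and keeps their order.
        return list(await asyncio.gather(*(s.run(previous) for s in self.steps)))


def search(topic: str) -> Step:
    async def do_search(_previous: Any) -> str:
        await asyncio.sleep(0)  # stand-in for a network call
        return f"notes on {topic}"

    return Step(do_search)


def compare() -> Step:
    async def do_compare(previous: Any) -> str:
        return " vs ".join(previous)

    return Step(do_compare)


workflow = (search("AI") & search("ML") & search("Crypto")) >> compare()
```

In Python, >> binds more tightly than &, which is why a parallel group that feeds a later step needs parentheses, as in the merge example above.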

You can also pass data between steps with lambdas:

workflow = (
    researcher("Find companies")
    >> analyst(lambda prev: f"Extract top 5 from: {prev['result']}")
    >> writer(lambda prev: f"Write report about: {prev['companies']}")
)

Interrupted workflows can resume from the last successful step:

from agentu import resume_workflow

result = await workflow.run(checkpoint="./checkpoints", workflow_id="my-report")

# After a crash, pick up where you left off
await resume_workflow(result["checkpoint_path"])
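Resuming from the last successful step comes down to persisting each step's result as soon as it completes. The sketch below shows one way to do that with a JSON file. The file layout, `run_with_checkpoint`, and the synchronous steps are all assumptions for illustration; the source does not describe agentu's checkpoint format.

```python
import json
import tempfile
from pathlib import Path
from typing import Any, Callable, List


def run_with_checkpoint(steps: List[Callable[[Any], Any]], path: Path) -> Any:
    """Run steps in order, saving each result so a rerun skips finished steps."""
    state = json.loads(path.read_text()) if path.exists() else {"results": []}
    results = state["results"]
    previous = results[-1] if results else None
    for step in steps[len(results):]:
        previous = step(previous)
        results.append(previous)
        path.write_text(json.dumps(state))  # checkpoint after every successful step
    return previous


calls: List[str] = []


def research(_previous: Any) -> str:
    calls.append("research")
    return "raw findings"


def flaky_summarize(previous: Any) -> str:
    calls.append("summarize")
    if calls.count("summarize") == 1:
        raise RuntimeError("simulated crash")
    return f"summary of {previous}"


checkpoint = Path(tempfile.mkdtemp()) / "my-report.json"
try:
    run_with_checkpoint([research, flaky_summarize], checkpoint)
except RuntimeError:
    pass  # the first run crashes after the research step was saved

# The second run reuses the saved research result and retries only the failed step.
final = run_with_checkpoint([research, flaky_summarize], checkpoint)
```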

Caching

Cache LLM responses to skip redundant API calls. Works with both plain strings and full conversations.

# Basic: memory + SQLite, 1-hour TTL
agent = Agent("assistant").with_cache()

# Same prompt, same response — no API call
await agent.infer("What is Python?")  # hits the LLM
await agent.infer("What is Python?")  # instant, from cache

Presets

# Exact match only (memory + SQLite)
agent.with_cache(preset="basic")

# Semantic matching — "vegan food" hits cache for "plant-based meals"
agent.with_cache(preset="smart", similarity_threshold=0.9)

# Offline-friendly with filesystem backup and background sync
agent.with_cache(preset="offline")

# Redis-backed for distributed setups
agent.with_cache(preset="distributed", redis_url="redis://localhost:6379")

Conversation caching

Full conversation lists cache the same way strings do -- deterministic serialization, same hash, same hit:

conversation = [
    {"role": "user", "content": "Hello!"},
    {"role": "assistant", "content": "Hi there!"},
    {"role": "user", "content": "What's the weather?"},
]
cache.set(conversation, "my-bot", "Looks sunny today.")
cache.get(conversation, "my-bot")  # → "Looks sunny today."

The second parameter is a namespace -- any string that scopes the cache. Usually the model name, but it can be anything.
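Deterministic serialization is what makes a list of message dictionaries hash the same way every time. The sketch below assumes JSON with sorted keys as the serialization and an in-memory dictionary as the store; `cache_key` and `ExactCache` are hypothetical names, not agentu classes.

```python
import hashlib
import json
from typing import Any, Dict, Optional


def cache_key(prompt: Any, namespace: str, temperature: float = 0.0) -> str:
    """Hash a string or a list of messages into a stable hex digest."""
    payload = json.dumps(
        {"prompt": prompt, "namespace": namespace, "temperature": temperature},
        sort_keys=True,  # key order inside each message no longer matters
        separators=(",", ":"),
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


class ExactCache:
    def __init__(self) -> None:
        self._store: Dict[str, str] = {}

    def set(self, prompt: Any, namespace: str, response: str) -> None:
        self._store[cache_key(prompt, namespace)] = response

    def get(self, prompt: Any, namespace: str) -> Optional[str]:
        return self._store.get(cache_key(prompt, namespace))


conversation = [
    {"role": "user", "content": "Hello!"},
    {"role": "assistant", "content": "Hi there!"},
    {"role": "user", "content": "What's the weather?"},
]
cache = ExactCache()
cache.set(conversation, "my-bot", "Looks sunny today.")
```

Because the namespace is part of the hashed payload, the same conversation stored under a different namespace is a different entry.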

How matching works

Strategy	How	When
Exact	SHA-256 hash of prompt + namespace + temperature	Default, always runs first
Semantic	Cosine similarity of embedding vectors	`preset="smart"` or higher, runs on exact miss

Semantic matching uses an embedding model (local all-MiniLM-L6-v2 or API-based nomic-embed-text) and only returns a hit when similarity exceeds the threshold (default 0.95).
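The semantic fallback can be sketched as a nearest-neighbor lookup with a cutoff. This is a toy illustration, not agentu's matcher: the two-dimensional vectors are hand-written stand-ins for real embeddings, and `semantic_lookup` is a hypothetical helper that returns a hit only when the best similarity exceeds the threshold.

```python
import math
from typing import List, Optional, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def semantic_lookup(
    query_vec: Sequence[float],
    entries: List[Tuple[Sequence[float], str]],
    threshold: float = 0.95,
) -> Optional[str]:
    best_score, best_response = 0.0, None
    for vec, response in entries:
        score = cosine_similarity(query_vec, vec)
        if score > best_score:
            best_score, best_response = score, response
    # Only a similarity strictly above the threshold counts as a hit.
    return best_response if best_score > threshold else None


# Toy "embeddings": one cached answer per axis.
entries = [
    ([1.0, 0.0], "cached answer about plant-based meals"),
    ([0.0, 1.0], "cached answer about laptops"),
]
near_hit = semantic_lookup([0.99, 0.05], entries)
miss = semantic_lookup([0.7, 0.7], entries)
```

A lower threshold trades more cache hits for a higher risk of returning an answer to a question that was only loosely related.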

Memory

agent.remember("Customer prefers email", importance=0.9)
memories = agent.recall(query="communication preferences")

SQLite-backed, searchable, persistent across sessions.
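A SQLite-backed memory can be approximated in a few lines with the standard `sqlite3` module. The sketch below is an assumption-laden stand-in: `MemoryStore` is hypothetical, the schema is invented for illustration, and recall is a naive keyword match ordered by importance, which is simpler than whatever search agentu actually performs.

```python
import sqlite3
from typing import List


class MemoryStore:
    def __init__(self, path: str = ":memory:") -> None:
        # Pass a file path instead of ":memory:" to persist across sessions.
        self._db = sqlite3.connect(path)
        self._db.execute(
            "CREATE TABLE IF NOT EXISTS memories "
            "(content TEXT NOT NULL, importance REAL NOT NULL)"
        )

    def remember(self, content: str, importance: float = 0.5) -> None:
        with self._db:  # commits on success, rolls back on error
            self._db.execute(
                "INSERT INTO memories (content, importance) VALUES (?, ?)",
                (content, importance),
            )

    def recall(self, query: str, limit: int = 5) -> List[str]:
        # Naive keyword match: a row qualifies if it contains any query word.
        words = query.lower().split()
        if not words:
            return []
        where = " OR ".join("lower(content) LIKE ?" for _ in words)
        rows = self._db.execute(
            f"SELECT content FROM memories WHERE {where} "
            "ORDER BY importance DESC LIMIT ?",
            [f"%{word}%" for word in words] + [limit],
        )
        return [row[0] for row in rows]


store = MemoryStore()
store.remember("Customer prefers email", importance=0.9)
store.remember("Customer is in the Pacific time zone", importance=0.4)
```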

Rationale Recording (ADRs)

Agents can explicitly record architectural decisions and the reasoning behind their actions, creating an automated audit trail.

agent = Agent("architect", enable_memory=True, enable_rationale_recording=True)

# The agent will automatically evaluate trade-offs and use the `record_rationale` tool
await agent.infer("Should we use threading or asyncio? Record your reasoning.")

# Retrieve the decision later
memories = agent.recall(query="asyncio")

Rationale events are simultaneously saved to memory (with memory_type="rationale") and emitted to the observability pipeline.

Skills

Load domain expertise on demand, either from local paths or GitHub:

from agentu import Agent, Skill

# From GitHub (cached locally at ~/.agentu/skills/)
agent = Agent("assistant").with_skills([
    "hemanth/agentu-skills/pdf-processor",
    "openai/skills/code-review@v1.0",
])

# From local
agent = Agent("assistant").with_skills(["./skills/my-skill"])

# Or define inline
pdf_skill = Skill(
    name="pdf-processing",
    description="Extract text and tables from PDF files",
    instructions="skills/pdf/SKILL.md",
    resources={"forms": "skills/pdf/FORMS.md"}
)
agent = Agent("assistant").with_skills([pdf_skill])

Skills load progressively: metadata first (100 chars), then instructions (1500 chars), then resources only when needed.

Sessions

Stateful conversations with automatic context:

from agentu import SessionManager

manager = SessionManager()
session = manager.create_session(agent)

await session.send("What's the weather in SF?")
await session.send("What about tomorrow?")  # knows you mean SF

Multi-user isolation, SQLite persistence, session timeout handling.

Evaluation

Test your agents with simple assertions:

from agentu import evaluate

test_cases = [
    {"ask": "What's 5 + 3?", "expect": 8},
    {"ask": "Weather in SF?", "expect": "sunny"}
]

results = await evaluate(agent, test_cases)
print(f"Accuracy: {results.accuracy}%")
print(results.to_json())  # export for CI/CD

Matching strategies: exact, substring, LLM-as-judge, or custom validators.
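The exact and substring strategies, and the accuracy figure, are easy to picture with a small stand-alone harness. The names below (`exact`, `substring`, `EvalResult`, `run_eval`) are hypothetical, the "agent" is a dictionary of canned answers, and accuracy is assumed to be a percentage because the example above prints it with a % sign.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List


def exact(expected: Any, actual: str) -> bool:
    return str(expected).strip() == actual.strip()


def substring(expected: Any, actual: str) -> bool:
    return str(expected).lower() in actual.lower()


@dataclass
class EvalResult:
    passed: int
    total: int

    @property
    def accuracy(self) -> float:
        # Percentage of passing cases.
        return 100.0 * self.passed / self.total if self.total else 0.0


def run_eval(
    answer: Callable[[str], str],
    test_cases: List[Dict[str, Any]],
    strategy: Callable[[Any, str], bool] = substring,
) -> EvalResult:
    passed = sum(
        1 for case in test_cases if strategy(case["expect"], answer(case["ask"]))
    )
    return EvalResult(passed=passed, total=len(test_cases))


# Canned answers stand in for a real agent.
canned = {"What's 5 + 3?": "5 + 3 = 8", "Weather in SF?": "It is foggy in SF."}
test_cases = [
    {"ask": "What's 5 + 3?", "expect": 8},
    {"ask": "Weather in SF?", "expect": "sunny"},
]
results = run_eval(canned.get, test_cases)
```

A custom validator fits the same shape: any callable taking `(expected, actual)` and returning a boolean can be passed as `strategy`.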

Observability

All LLM calls, tool executions, self-corrections, and sandbox events are tracked automatically:

from agentu import Agent, observe

observe.configure(output="console")  # or "json" or "silent"

agent = Agent("assistant").with_sandbox(
    read_tools=[search],
    write_tools=[save],
    timeout=10,
)
await agent.infer("Find me laptops")

metrics = agent.observer.get_metrics()
# {"tool_calls": 3, "total_duration_ms": 1240, "errors": 0}

Events captured: tool_call, tool_blocked, self_correction, llm_request, inference_start, inference_end, error, session_create, session_end.

Sandbox events include sandbox_exit_code, sandbox_stderr, and sandbox_timed_out for post-mortem debugging.
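Metrics like the ones shown above can be derived by folding over a list of recorded events. The following sketch assumes a simple in-memory event list; `Observer.record` and the event dictionary layout are hypothetical, and only the event type names and the three metric keys come from the text above.

```python
import time
from collections import Counter
from typing import Any, Dict, List


class Observer:
    def __init__(self) -> None:
        self.events: List[Dict[str, Any]] = []

    def record(self, event_type: str, duration_ms: float = 0, **fields: Any) -> None:
        # Extra fields (for example sandbox_exit_code) ride along with the event.
        self.events.append(
            {"type": event_type, "duration_ms": duration_ms, "ts": time.time(), **fields}
        )

    def get_metrics(self) -> Dict[str, Any]:
        counts = Counter(event["type"] for event in self.events)
        return {
            "tool_calls": counts["tool_call"],
            "total_duration_ms": sum(event["duration_ms"] for event in self.events),
            "errors": counts["error"],
        }


observer = Observer()
observer.record("llm_request", duration_ms=500)
observer.record("tool_call", duration_ms=400, sandbox_exit_code=0)
observer.record("tool_call", duration_ms=340, sandbox_timed_out=True)
observer.record("error", message="tool timed out")
metrics = observer.get_metrics()
```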

Dashboard

from agentu import serve

serve(agent, port=8000)
# http://localhost:8000/dashboard — live metrics
# http://localhost:8000/docs — auto-generated API docs

Notifications

Send low-latency, non-blocking alerts to Slack, Discord, Email, or SMS when an agent finishes its task.

pip install agentu[notify]

from agentu import Agent

# Attach notification middleware via the builder pattern
agent = Agent("my-bot").with_notifier([
    "slack://bot-token/channel-id",
    "discord://webhook_id/webhook_token"
])

# The agent executes without blocking, and posts a rich summary containing tokens and elapsed ms.
await agent.infer("Audit the database schema")

Custom Formatting & Failure Alerts

Notifications also fire when the agent fails (for example, on a rate-limit error). To control exactly how success and failure alerts look, provide a custom formatter:

from agentu.middleware import NotifyMiddleware

def custom_format(context, response, error) -> str:
    if error:
        return f"🚨 AGENT CRASH 🚨\n{error}"
    return f"✅ Agent {context.namespace} finished in {context.elapsed_ms}ms"

# Register the middleware with use() to pass the custom formatter
agent.use(NotifyMiddleware(
    targets=["slack://bot-token/channel-id"],
    formatter=custom_format
))

Ralph mode

Run agents in autonomous loops with progress tracking:

result = await agent.ralph(
    prompt_file="PROMPT.md",
    max_iterations=50,
    timeout_minutes=30,
    on_iteration=lambda i, data: print(f"[{i}] {data['result'][:50]}...")
)

The agent loops until all checkpoints in PROMPT.md are complete or limits are reached.
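The stopping rule (all checkpoints complete, or an iteration or time limit reached) can be sketched as a plain loop. Everything here is an assumption made for illustration: `autonomous_loop` is a hypothetical helper, the checkpoints are an in-memory list rather than anything parsed from PROMPT.md, and the returned status strings are invented.

```python
import time
from typing import Any, Callable, Dict, List, Optional


def autonomous_loop(
    step: Callable[[int], Any],
    is_done: Callable[[], bool],
    max_iterations: int,
    timeout_seconds: float,
    on_iteration: Optional[Callable[[int, Dict[str, Any]], None]] = None,
) -> Dict[str, Any]:
    started = time.monotonic()
    for i in range(1, max_iterations + 1):
        if time.monotonic() - started > timeout_seconds:
            return {"status": "timeout", "iterations": i - 1}
        result = step(i)
        if on_iteration is not None:
            on_iteration(i, {"result": result})
        if is_done():
            return {"status": "complete", "iterations": i}
    return {"status": "max_iterations", "iterations": max_iterations}


# Each iteration completes one outstanding checkpoint.
remaining: List[str] = ["write tests", "fix lint", "update docs"]
progress: List[Any] = []
outcome = autonomous_loop(
    step=lambda i: remaining.pop(0),
    is_done=lambda: not remaining,
    max_iterations=50,
    timeout_seconds=30 * 60,
    on_iteration=lambda i, data: progress.append((i, data["result"])),
)
```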

Tool search

When you have hundreds of tools, you don't want them all in context. Deferred tools are discovered on-demand:

agent = Agent("payments").with_tools(defer=[charge_card, send_receipt, refund_payment])

# Agent calls search_tools("charge card") → finds charge_card → executes it
result = await agent.infer("charge $50 to card_123")

A search_tools function is auto-added. The agent searches, activates, and calls -- all internally.
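One simple way to picture on-demand discovery is a keyword search over tool names and docstrings, followed by activation of the match. The sketch below is not agentu's search: `DeferredTools` and `activate` are hypothetical, the three tools are toy functions with invented signatures, and scoring is plain word overlap.

```python
from typing import Any, Callable, Dict, List


def charge_card(card_id: str, amount: float) -> str:
    """Charge an amount to a stored card."""
    return f"charged {amount} to {card_id}"


def send_receipt(email: str) -> str:
    """Email a payment receipt to the customer."""
    return f"receipt sent to {email}"


def refund_payment(payment_id: str) -> str:
    """Refund a previous payment."""
    return f"refunded {payment_id}"


class DeferredTools:
    def __init__(self, tools: List[Callable[..., Any]]) -> None:
        self._deferred: Dict[str, Callable[..., Any]] = {t.__name__: t for t in tools}
        self.active: Dict[str, Callable[..., Any]] = {}

    def search_tools(self, query: str, limit: int = 3) -> List[str]:
        terms = set(query.lower().split())
        scored = []
        for name, tool in self._deferred.items():
            text = f"{name.replace('_', ' ')} {tool.__doc__ or ''}"
            words = set(text.lower().replace(".", " ").split())
            overlap = len(terms & words)
            if overlap:
                scored.append((overlap, name))
        # Best overlap first, then alphabetical for a stable order.
        scored.sort(key=lambda item: (-item[0], item[1]))
        return [name for _, name in scored[:limit]]

    def activate(self, name: str) -> Callable[..., Any]:
        # Only activated tools would be placed in the model's context.
        self.active[name] = self._deferred[name]
        return self.active[name]


tools = DeferredTools([charge_card, send_receipt, refund_payment])
matches = tools.search_tools("charge card")
```

Only the names returned by the search need to be described to the model, which keeps the context small even when the deferred set is large.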

MCP

Connect to Model Context Protocol servers:

agent = await Agent("bot").with_mcp(["http://localhost:3000"])
agent = await Agent("bot").with_mcp([
    {"url": "https://api.com/mcp", "headers": {"Auth": "Bearer xyz"}}
])

LLM support

Works with any OpenAI-compatible API. Auto-detects available models from Ollama:

Agent("assistant")                                        # first available Ollama model
Agent("assistant", model="qwen3")                         # specific model
Agent("assistant", model="gpt-4", api_key="sk-...")       # OpenAI
Agent("assistant", model="mistral", api_base="http://localhost:8000/v1")  # vLLM, LM Studio, etc.

REST API

from agentu import serve

serve(agent, port=8000, enable_cors=True)

Endpoints: /execute, /process, /tools, /memory/remember, /memory/recall, /docs

API reference

# Agent
agent = Agent(name)                       # auto-detect model
agent = Agent(name, model="qwen3")        # explicit model
agent = Agent(name, max_turns=5)          # limit multi-turn cycles
agent.with_tools([func1, func2])          # active tools
agent.with_tools(defer=[many_funcs])      # searchable tools
agent.with_cache(preset="smart")          # caching
agent.with_skills(["github/repo/skill"])  # skills
agent.with_rules("AGENTS.md")            # project-level rules
agent.with_notifier(["slack://bot-token"])       # notifications
agent.with_permissions(allow_dangerous=True)     # permission control
await agent.with_mcp([url])              # MCP servers

# Sandbox
agent.with_sandbox(                       # tool isolation
    read_tools=[search, get_weather],
    write_tools=[save_file, send_email],
    timeout=10,
)

# Guardrails
agent.with_guardrails(                    # self-correction
    output_guardrails=[NoPII()],
    max_corrections=2,
)

await agent.call("tool", params)          # direct tool execution
await agent.infer("natural language")     # LLM-routed execution

agent.remember(content, importance=0.8)   # store memory
agent.recall(query)                       # search memory

# Sessions
manager = SessionManager()
session = manager.create_session(agent)
await session.send("message")
session.get_history(limit=10)
session.clear_history()

# Evaluation
results = await evaluate(agent, test_cases)
results.accuracy     # 95.0
results.to_json()    # export

# Workflows
step1 >> step2          # sequential
step1 & step2           # parallel
await workflow.run()    # execute

Examples

git clone https://github.com/hemanth/agentu && cd agentu

python examples/basic.py                # simple agent
python examples/workflow.py             # workflows
python examples/memory.py               # memory system
python examples/example_sessions.py     # stateful sessions
python examples/example_eval.py         # agent evaluation
python examples/example_observe.py      # observability
python examples/api.py                  # REST API

Testing

pytest
pytest --cov=agentu

License

MIT

Release history

Version	Released
1.18.1 (this version)	May 18, 2026
1.18.0	May 18, 2026
1.17.0	Apr 21, 2026
1.16.0	Apr 13, 2026
1.15.0	Apr 13, 2026
1.14.0	Mar 25, 2026
1.13.0	Mar 24, 2026
1.12.1	Mar 24, 2026
1.12.0	Mar 24, 2026
1.11.0	Mar 13, 2026
1.10.0	Feb 6, 2026
1.9.0	Feb 6, 2026
1.8.3	Jan 24, 2026
1.8.2	Jan 21, 2026
1.8.1	Jan 19, 2026
1.8.0	Jan 17, 2026
1.7.1	Jan 15, 2026
1.7.0	Jan 15, 2026
1.6.1	Jan 9, 2026
1.6.0	Jan 9, 2026
1.5.1	Jan 2, 2026
1.2.2	Dec 20, 2025
1.2.1	Dec 20, 2025
1.1.0	Nov 25, 2025
1.0.1	Nov 10, 2025
1.0.0	Nov 10, 2025
0.3.0	Nov 3, 2025
0.2.0	Nov 3, 2025
0.1.0	Jan 7, 2025

Download files

Source Distribution

agentu-1.18.1.tar.gz (383.1 kB)

Uploaded May 18, 2026 Source

Built Distribution

agentu-1.18.1-py3-none-any.whl (95.7 kB)

Uploaded May 18, 2026 Python 3

File details

Details for the file agentu-1.18.1.tar.gz.

File metadata

Download URL: agentu-1.18.1.tar.gz
Upload date: May 18, 2026
Size: 383.1 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.9.6

File hashes

Hashes for agentu-1.18.1.tar.gz
Algorithm	Hash digest
SHA256	`f5bb80251d00b1d3fe8c81e9c1d78955b7d74440d1d3be9ffa98e997522ac09a`
MD5	`a9dab9f6cf557b26120db0e3b9be001b`
BLAKE2b-256	`d37004975d5a942a7262c1394ea6a14a1f780e0b54561c47ecb0e478e7c0e37a`

File details

Details for the file agentu-1.18.1-py3-none-any.whl.

File metadata

Download URL: agentu-1.18.1-py3-none-any.whl
Upload date: May 18, 2026
Size: 95.7 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.9.6

File hashes

Hashes for agentu-1.18.1-py3-none-any.whl
Algorithm	Hash digest
SHA256	`d064aa0e925bd33b47acdacd808c5c6f0b5d4ba449ca9442598909d34796f4b5`
MD5	`2e9d403ff63e01db7cd5a660a416a777`
BLAKE2b-256	`5047975bb671fa833863897f808e1e1de08314b3092400641589e3ce997989d0`
