A flexible Python package for creating AI agents with customizable tools

agentu

Build AI agents that actually do things.

pip install agentu

Quick start

from agentu import Agent

def search_products(query: str) -> list:
    # db is a placeholder for your own data layer
    return db.products.search(query)

agent = Agent("sales").with_tools([search_products])

# Call a tool directly
result = await agent.call("search_products", {"query": "laptop"})

# Or let the LLM figure it out
result = await agent.infer("Find me laptops under $1500")

call() runs a tool. infer() lets the LLM pick the tool and fill in the parameters from natural language.
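To make the distinction concrete, here is a minimal sketch of what a direct call amounts to: look a function up by name and pass the parameter dict as keyword arguments. `ToolRegistry` is a hypothetical stand-in written for this illustration, not agentu's implementation, and the catalog is made-up sample data.

```python
import inspect

class ToolRegistry:
    """Hypothetical stand-in for an agent's tool table (not agentu's API)."""

    def __init__(self, tools):
        self._tools = {fn.__name__: fn for fn in tools}

    def describe(self, name):
        # Parameter names an LLM would have to fill in for infer()-style routing.
        return list(inspect.signature(self._tools[name]).parameters)

    def call(self, name, params):
        # Direct execution: no LLM involved, just name lookup plus keyword arguments.
        return self._tools[name](**params)

def search_products(query: str) -> list:
    catalog = ["laptop stand", "gaming laptop", "desk lamp"]  # sample data
    return [item for item in catalog if query in item]

registry = ToolRegistry([search_products])
print(registry.describe("search_products"))
print(registry.call("search_products", {"query": "laptop"}))
```

With infer(), the model's job is roughly to produce the tool name and the parameter dict that the direct call takes explicitly.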

Declarative Configuration

Agents can also be defined declaratively in YAML or JSON. A config file sets the model, system prompt, notification webhooks (via Apprise), cache settings, and MCP tools without any Python code:

1. Create a bot.yaml (or .json)

name: "support-agent"
model: "openai/gpt-4o"
system_prompt: "You are an expert IT agent."
notify:
  - "discord://webhook/id"
cache:
  preset: "distributed"

2. Load dynamically

from agentu import Agent
import asyncio

# Placeholder local tool; replace with your own logic
def resolve_ticket(ticket_id: str) -> str:
    return f"Ticket {ticket_id} resolved"

async def main():
    agent = await Agent.from_config("bot.yaml")

    # Optionally add local Python tools, then infer
    agent.with_tools([resolve_ticket])
    await agent.infer("Help me reset my router")

asyncio.run(main())

(Loading .yaml files requires pip install agentu[yaml]. JSON loads without extra dependencies.)
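For the JSON route, the bot.yaml shown earlier could be written as follows. This is a sketch that assumes the JSON keys mirror the YAML keys one to one; it only parses the text with the standard library and does not call agentu.

```python
import json

# JSON rendering of the bot.yaml example; assumes keys map one-to-one.
BOT_JSON = """
{
  "name": "support-agent",
  "model": "openai/gpt-4o",
  "system_prompt": "You are an expert IT agent.",
  "notify": ["discord://webhook/id"],
  "cache": {"preset": "distributed"}
}
"""

config = json.loads(BOT_JSON)
print(config["name"], config["cache"]["preset"])
```

Saved as bot.json, a file like this would presumably be passed to Agent.from_config in place of bot.yaml.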

Workflows

Chain agents with >> (sequential) and & (parallel):

# One after another
workflow = researcher("Find AI trends") >> analyst("Analyze") >> writer("Summarize")

# All at once
workflow = search("AI") & search("ML") & search("Crypto")

# Parallel first, then merge
workflow = (search("AI") & search("ML")) >> analyst("Compare findings")

result = await workflow.run()

You can also pass data between steps with lambdas:

workflow = (
    researcher("Find companies")
    >> analyst(lambda prev: f"Extract top 5 from: {prev['result']}")
    >> writer(lambda prev: f"Write report about: {prev['companies']}")
)
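The operator syntax can be reproduced in plain Python with `__rshift__` and `__and__`. The sketch below shows one way such chaining might work; `Step` and `task` are hypothetical names, and this is not agentu's actual implementation.

```python
import asyncio

class Step:
    """Hypothetical workflow node; fn is an async callable taking the previous result."""

    def __init__(self, fn):
        self.fn = fn

    def __rshift__(self, other):
        # a >> b: run a, then feed its result into b.
        async def seq(prev):
            return await other.fn(await self.fn(prev))
        return Step(seq)

    def __and__(self, other):
        # a & b: run both concurrently and collect the results in a list.
        async def par(prev):
            return list(await asyncio.gather(self.fn(prev), other.fn(prev)))
        return Step(par)

    async def run(self, prev=None):
        return await self.fn(prev)

def task(label):
    async def fn(prev):
        return f"{label}({prev})"
    return Step(fn)

workflow = (task("search_ai") & task("search_ml")) >> task("compare")
result = asyncio.run(workflow.run())
print(result)
```

In Python, `>>` binds more tightly than `&`, which is why a mixed expression needs parentheses around the parallel group.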

Interrupted workflows can resume from the last successful step:

from agentu import resume_workflow

result = await workflow.run(checkpoint="./checkpoints", workflow_id="my-report")

# After a crash, pick up where you left off
await resume_workflow(result["checkpoint_path"])
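Resuming from the last successful step usually means persisting each step's result as soon as it completes and skipping recorded steps on the next run. A minimal sketch under that assumption follows; the JSON file layout and the `run_steps` helper are invented for illustration and say nothing about agentu's checkpoint format.

```python
import json
import tempfile
from pathlib import Path

def run_steps(steps, checkpoint_path):
    """Run (name, fn) steps in order, skipping any already recorded on disk."""
    path = Path(checkpoint_path)
    done = json.loads(path.read_text()) if path.exists() else {}
    for name, fn in steps:
        if name in done:
            continue  # finished in an earlier run
        done[name] = fn()
        path.write_text(json.dumps(done))  # persist after every successful step
    return done

calls = []

def make(name, fail=False):
    def fn():
        if fail:
            raise RuntimeError(f"{name} crashed")
        calls.append(name)
        return f"{name} ok"
    return fn

ckpt = Path(tempfile.mkdtemp()) / "my-report.json"
try:
    run_steps([("research", make("research")), ("write", make("write", fail=True))], ckpt)
except RuntimeError:
    pass  # simulated crash in the second step

final = run_steps([("research", make("research")), ("write", make("write"))], ckpt)
print(calls)
```

The second run executes only the step that failed, so "research" is not repeated.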

Caching

Cache LLM responses to skip redundant API calls. Works with both plain strings and full conversations.

# Basic: memory + SQLite, 1-hour TTL
agent = Agent("assistant").with_cache()

# Same prompt, same response — no API call
await agent.infer("What is Python?")  # hits the LLM
await agent.infer("What is Python?")  # instant, from cache

Presets

# Exact match only (memory + SQLite)
agent.with_cache(preset="basic")

# Semantic matching — "vegan food" hits cache for "plant-based meals"
agent.with_cache(preset="smart", similarity_threshold=0.9)

# Offline-friendly with filesystem backup and background sync
agent.with_cache(preset="offline")

# Redis-backed for distributed setups
agent.with_cache(preset="distributed", redis_url="redis://localhost:6379")

Conversation caching

Full conversation lists cache the same way strings do — deterministic serialization, same hash, same hit:

conversation = [
    {"role": "user", "content": "Hello!"},
    {"role": "assistant", "content": "Hi there!"},
    {"role": "user", "content": "What's the weather?"},
]
cache.set(conversation, "my-bot", "Looks sunny today.")
cache.get(conversation, "my-bot")  # → "Looks sunny today."

The second parameter is a namespace — any string that scopes the cache. Usually the model name, but it can be anything.
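Deterministic serialization is what makes a list of message dicts hashable in a stable way. The sketch below shows one common approach, canonical JSON fed into SHA-256. The `cache_key` function is hypothetical; temperature is included because the matching table in this section lists it as part of the exact-match hash.

```python
import hashlib
import json

def cache_key(prompt, namespace, temperature=0.0):
    """Illustrative key: SHA-256 over a canonical JSON encoding of the inputs."""
    payload = json.dumps(
        {"prompt": prompt, "namespace": namespace, "temperature": temperature},
        sort_keys=True,            # dict key order must not change the hash
        separators=(",", ":"),
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()

a = [{"role": "user", "content": "Hello!"}]
b = [{"content": "Hello!", "role": "user"}]  # same message, different key order

print(cache_key(a, "my-bot") == cache_key(b, "my-bot"))
print(cache_key(a, "my-bot") == cache_key(a, "other-bot"))
```

The same function accepts a plain string prompt, since strings serialize to JSON just as lists do.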

How matching works

Strategy	How	When
Exact	SHA-256 hash of prompt + namespace + temperature	Default, always runs first
Semantic	Cosine similarity of embedding vectors	`preset="smart"` or higher, runs on exact miss

Semantic matching uses an embedding model (local all-MiniLM-L6-v2 or API-based nomic-embed-text) and only returns a hit when similarity exceeds the threshold (default 0.95).
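The threshold check itself is simple once embeddings exist. The sketch below uses hand-written two-dimensional vectors in place of real embeddings, and `semantic_lookup` is a hypothetical helper, so treat it as an illustration of the comparison only.

```python
import math

def cosine(u, v):
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0

def semantic_lookup(query_vec, entries, threshold=0.95):
    """Return the cached value whose embedding is most similar, if above threshold."""
    best = max(entries, key=lambda e: cosine(query_vec, e[0]), default=None)
    if best is not None and cosine(query_vec, best[0]) > threshold:
        return best[1]
    return None

# (embedding, cached response) pairs; the vectors are toy values
entries = [([1.0, 0.0], "answer about plants"), ([0.0, 1.0], "answer about cars")]
print(semantic_lookup([0.99, 0.05], entries))  # close to the first entry
print(semantic_lookup([0.7, 0.7], entries))    # similar to neither, so a miss
```

Lowering the threshold trades precision for hit rate: more prompts reuse a cached answer, but the answer is more likely to be for a different question.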

Memory

agent.remember("Customer prefers email", importance=0.9)
memories = agent.recall(query="communication preferences")

SQLite-backed, searchable, persistent across sessions.
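As a rough picture of what a SQLite-backed memory store involves, here is a sketch built on the standard sqlite3 module. The table layout and the substring search are assumptions made for illustration, not agentu's schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # a file path here would persist across sessions
conn.execute("CREATE TABLE memories (content TEXT, importance REAL)")

def remember(content, importance=0.5):
    conn.execute("INSERT INTO memories VALUES (?, ?)", (content, importance))

def recall(query, limit=5):
    # Naive substring search, most important first.
    rows = conn.execute(
        "SELECT content FROM memories WHERE content LIKE ? "
        "ORDER BY importance DESC LIMIT ?",
        (f"%{query}%", limit),
    )
    return [row[0] for row in rows]

remember("Customer prefers email", importance=0.9)
remember("Customer asked about email outage", importance=0.4)
print(recall("email"))
```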

Skills

Load domain expertise on demand, either from local paths or GitHub:

from agentu import Agent, Skill

# From GitHub (cached locally at ~/.agentu/skills/)
agent = Agent("assistant").with_skills([
    "hemanth/agentu-skills/pdf-processor",
    "openai/skills/code-review@v1.0",
])

# From local
agent = Agent("assistant").with_skills(["./skills/my-skill"])

# Or define inline
pdf_skill = Skill(
    name="pdf-processing",
    description="Extract text and tables from PDF files",
    instructions="skills/pdf/SKILL.md",
    resources={"forms": "skills/pdf/FORMS.md"}
)
agent = Agent("assistant").with_skills([pdf_skill])

Skills load progressively: metadata first (100 chars), then instructions (1500 chars), then resources only when needed.

Sessions

Stateful conversations with automatic context:

from agentu import SessionManager

manager = SessionManager()
session = manager.create_session(agent)

await session.send("What's the weather in SF?")
await session.send("What about tomorrow?")  # knows you mean SF

Multi-user isolation, SQLite persistence, session timeout handling.

Evaluation

Test your agents with simple assertions:

from agentu import evaluate

test_cases = [
    {"ask": "What's 5 + 3?", "expect": 8},
    {"ask": "Weather in SF?", "expect": "sunny"}
]

results = await evaluate(agent, test_cases)
print(f"Accuracy: {results.accuracy}%")
print(results.to_json())  # export for CI/CD

Matching strategies: exact, substring, LLM-as-judge, or custom validators.
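The non-LLM strategies are easy to picture in plain Python. The sketch below is a hypothetical matcher covering exact, substring, and custom-validator checks; it does not reproduce agentu's evaluate() internals.

```python
def matches(actual, expected, strategy="substring"):
    """Illustrative matchers; expected may be a value or a callable validator."""
    if callable(expected):
        return bool(expected(actual))  # custom validator
    if strategy == "exact":
        return str(actual).strip() == str(expected)
    return str(expected).lower() in str(actual).lower()  # substring

def accuracy(pairs, strategy="substring"):
    hits = sum(matches(actual, expected, strategy) for actual, expected in pairs)
    return 100.0 * hits / len(pairs)

# (agent answer, expectation) pairs; the answers are made-up sample data
pairs = [
    ("5 + 3 = 8", 8),
    ("It is sunny in SF today.", "sunny"),
    ("I don't know.", lambda text: "rain" in text),
]
print(round(accuracy(pairs), 1))
```

Two of the three sample answers pass, so the script reports 66.7.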

Observability

All LLM calls and tool executions are tracked automatically:

from agentu import Agent, observe

observe.configure(output="console")  # or "json" or "silent"

agent = Agent("assistant").with_tools([...])
await agent.infer("Find me laptops")

metrics = agent.observer.get_metrics()
# {"tool_calls": 3, "total_duration_ms": 1240, "errors": 0}
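Automatic tracking of this kind is typically done by wrapping each tool call. The decorator below is a sketch of that idea with a module-level metrics dict; the `observed` name and the dict layout are assumptions modeled on the example output, not agentu's observer.

```python
import time
from functools import wraps

metrics = {"tool_calls": 0, "total_duration_ms": 0.0, "errors": 0}

def observed(fn):
    """Wrap a tool so every call updates the shared metrics dict."""
    @wraps(fn)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        try:
            return fn(*args, **kwargs)
        except Exception:
            metrics["errors"] += 1
            raise
        finally:
            # Runs on success and on failure, so failed calls are counted too.
            metrics["tool_calls"] += 1
            metrics["total_duration_ms"] += (time.perf_counter() - start) * 1000
    return wrapper

@observed
def divide(a, b):
    return a / b

divide(6, 3)
try:
    divide(1, 0)
except ZeroDivisionError:
    pass
print(metrics["tool_calls"], metrics["errors"])
```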

Dashboard

from agentu import serve

serve(agent, port=8000)
# http://localhost:8000/dashboard — live metrics
# http://localhost:8000/docs — auto-generated API docs

Notifications

Send low-latency, non-blocking alerts to Slack, Discord, Email, or SMS when an agent finishes its task.

pip install agentu[notify]

from agentu import Agent

# Attach notification middleware via the builder pattern
agent = Agent("my-bot").with_notifier([
    "slack://bot-token/channel-id",
    "discord://webhook_id/webhook_token"
])

# The agent executes without blocking, and posts a rich summary containing tokens and elapsed ms.
await agent.infer("Audit the database schema")

Custom Formatting & Failure Alerts

Notifications also fire when an agent run fails (for example, on a rate-limit error). To control exactly how success and failure alerts look, provide a custom formatter:

from agentu.middleware import NotifyMiddleware

def custom_format(context, response, error) -> str:
    if error:
        return f"🚨 AGENT CRASH 🚨\n{error}"
    return f"✅ Agent {context.namespace} finished in {context.elapsed_ms}ms"

# Fall back to base use() method to pass the custom formatter
agent.use(NotifyMiddleware(
    targets=["slack://bot-token/channel-id"], 
    formatter=custom_format
))
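A formatter like this can be checked without sending anything. The sketch below calls it with a stand-in context object; `SimpleNamespace` is only a substitute for whatever context agentu passes, and the two attributes are the ones the formatter reads.

```python
from types import SimpleNamespace

def custom_format(context, response, error) -> str:
    if error:
        return f"🚨 AGENT CRASH 🚨\n{error}"
    return f"✅ Agent {context.namespace} finished in {context.elapsed_ms}ms"

# Stand-in for the context object; only the two attributes used above are assumed.
ctx = SimpleNamespace(namespace="my-bot", elapsed_ms=1240)

print(custom_format(ctx, "done", None))
print(custom_format(ctx, None, RuntimeError("rate limit exceeded")))
```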

Ralph mode

Run agents in autonomous loops with progress tracking:

result = await agent.ralph(
    prompt_file="PROMPT.md",
    max_iterations=50,
    timeout_minutes=30,
    on_iteration=lambda i, data: print(f"[{i}] {data['result'][:50]}...")
)

The agent loops until all checkpoints in PROMPT.md are complete or limits are reached.
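The stopping rule can be sketched as a loop with three exit conditions. The example assumes checkpoints are Markdown checkboxes ("- [ ]"), which the description above does not confirm, and it uses a fake one-checkbox-per-turn step in place of a real agent.

```python
import re
import time

def pending(markdown):
    """Count unchecked '- [ ]' items; the checkbox format is an assumption."""
    return len(re.findall(r"^\s*[-*] \[ \]", markdown, flags=re.MULTILINE))

def run_loop(markdown, step, max_iterations=50, timeout_minutes=30):
    deadline = time.monotonic() + timeout_minutes * 60
    iterations = 0
    while pending(markdown) and iterations < max_iterations and time.monotonic() < deadline:
        markdown = step(markdown)  # one agent turn; returns the updated file text
        iterations += 1
    return markdown, iterations

def tick_one(markdown):
    # Fake "agent" that completes exactly one checkbox per turn.
    return markdown.replace("[ ]", "[x]", 1)

prompt = "- [ ] write tests\n- [ ] fix bug\n- [x] set up repo\n"
final, used = run_loop(prompt, tick_one)
print(used, pending(final))
```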

Tool search

When you have hundreds of tools, you don't want them all in context. Deferred tools are discovered on-demand:

agent = Agent("payments").with_tools(defer=[charge_card, send_receipt, refund_payment])

# Agent calls search_tools("charge card") → finds charge_card → executes it
result = await agent.infer("charge $50 to card_123")

A search_tools function is auto-added. The agent searches, activates, and calls — all internally.
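A search step of this kind can be as simple as keyword overlap against tool names and docstrings. The sketch below is a hypothetical version of that idea; agentu's own search_tools may rank differently. The docstrings are invented for the example.

```python
def search_tools(query, tools, limit=3):
    """Rank tools by how many query words appear in their name or docstring."""
    words = query.lower().split()
    scored = []
    for fn in tools:
        haystack = f"{fn.__name__.replace('_', ' ')} {fn.__doc__ or ''}".lower()
        score = sum(word in haystack for word in words)
        if score:
            scored.append((score, fn.__name__))
    scored.sort(key=lambda pair: -pair[0])  # stable sort: ties keep registration order
    return [name for _, name in scored[:limit]]

def charge_card(card_id, amount):
    """Charge an amount to a stored card."""

def send_receipt(email):
    """Email a receipt to the customer."""

def refund_payment(payment_id):
    """Refund a previous card payment."""

print(search_tools("charge card", [charge_card, send_receipt, refund_payment]))
```

Only the matching tools would then be added to the model's context, which keeps the prompt small no matter how many tools are registered.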

MCP

Connect to Model Context Protocol servers:

agent = await Agent("bot").with_mcp(["http://localhost:3000"])
agent = await Agent("bot").with_mcp([
    {"url": "https://api.com/mcp", "headers": {"Auth": "Bearer xyz"}}
])

LLM support

Works with any OpenAI-compatible API. Auto-detects available models from Ollama:

Agent("assistant")                                        # first available Ollama model
Agent("assistant", model="qwen3")                         # specific model
Agent("assistant", model="gpt-4", api_key="sk-...")       # OpenAI
Agent("assistant", model="mistral", api_base="http://localhost:8000/v1")  # vLLM, LM Studio, etc.

REST API

from agentu import serve

serve(agent, port=8000, enable_cors=True)

Endpoints: /execute, /process, /tools, /memory/remember, /memory/recall, /docs

API reference

# Agent
agent = Agent(name)                       # auto-detect model
agent = Agent(name, model="qwen3")        # explicit model
agent = Agent(name, max_turns=5)          # limit multi-turn cycles
agent.with_tools([func1, func2])          # active tools
agent.with_tools(defer=[many_funcs])      # searchable tools
agent.with_cache(preset="smart")          # caching
agent.with_skills(["github/repo/skill"])  # skills
agent.with_notifier(["slack://bot-token"])  # notifications
await agent.with_mcp([url])               # MCP servers

await agent.call("tool", params)          # direct tool execution
await agent.infer("natural language")     # LLM-routed execution

agent.remember(content, importance=0.8)   # store memory
agent.recall(query)                       # search memory

# Sessions
manager = SessionManager()
session = manager.create_session(agent)
await session.send("message")
session.get_history(limit=10)
session.clear_history()

# Evaluation
results = await evaluate(agent, test_cases)
results.accuracy     # 95.0
results.to_json()    # export

# Workflows
step1 >> step2          # sequential
step1 & step2           # parallel
await workflow.run()    # execute

Examples

git clone https://github.com/hemanth/agentu && cd agentu

python examples/basic.py                # simple agent
python examples/workflow.py             # workflows
python examples/memory.py               # memory system
python examples/example_sessions.py     # stateful sessions
python examples/example_eval.py         # agent evaluation
python examples/example_observe.py      # observability
python examples/api.py                  # REST API

Testing

pytest
pytest --cov=agentu

License

MIT

Release history

Version	Released
1.18.1	May 18, 2026
1.18.0	May 18, 2026
1.17.0	Apr 21, 2026
1.16.0 (this version)	Apr 13, 2026
1.15.0	Apr 13, 2026
1.14.0	Mar 25, 2026
1.13.0	Mar 24, 2026
1.12.1	Mar 24, 2026
1.12.0	Mar 24, 2026
1.11.0	Mar 13, 2026
1.10.0	Feb 6, 2026
1.9.0	Feb 6, 2026
1.8.3	Jan 24, 2026
1.8.2	Jan 21, 2026
1.8.1	Jan 19, 2026
1.8.0	Jan 17, 2026
1.7.1	Jan 15, 2026
1.7.0	Jan 15, 2026
1.6.1	Jan 9, 2026
1.6.0	Jan 9, 2026
1.5.1	Jan 2, 2026
1.2.2	Dec 20, 2025
1.2.1	Dec 20, 2025
1.1.0	Nov 25, 2025
1.0.1	Nov 10, 2025
1.0.0	Nov 10, 2025
0.3.0	Nov 3, 2025
0.2.0	Nov 3, 2025
0.1.0	Jan 7, 2025

Download files

Source distribution: agentu-1.16.0.tar.gz (370.3 kB), uploaded Apr 13, 2026
Built distribution: agentu-1.16.0-py3-none-any.whl (88.7 kB), uploaded Apr 13, 2026, Python 3

File details

Details for the file agentu-1.16.0.tar.gz.

File metadata

Download URL: agentu-1.16.0.tar.gz
Upload date: Apr 13, 2026
Size: 370.3 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.9.6

File hashes

Hashes for agentu-1.16.0.tar.gz
Algorithm	Hash digest
SHA256	`3f193ef11708ec01a6736885438784066c9b1b62ae988dae93e400c1b58d550d`
MD5	`d556094960b9463165c092d07d5bf973`
BLAKE2b-256	`ad2e7c98de6896bddb78c8201706a8b53814cfb03bec070011cd50c51844f56d`

File details

Details for the file agentu-1.16.0-py3-none-any.whl.

File metadata

Download URL: agentu-1.16.0-py3-none-any.whl
Upload date: Apr 13, 2026
Size: 88.7 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.9.6

File hashes

Hashes for agentu-1.16.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`fb73fdbc80a5c90d5af82fe3beabeb01681746942f70a1832540fc15062c6a54`
MD5	`7480d59f369a8f806184c41ea240f0f4`
BLAKE2b-256	`691942aa239fa0c3a948a07849d1982d89692fcc5aa3a6c84f9ff335a0e12daf`
