
A flexible Python package for creating AI agents with customizable tools


agentu

Build AI agents that actually do things.

pip install agentu

Quick start

import asyncio
from agentu import Agent

def search_products(query: str) -> list:
    # `db` stands in for your own data layer
    return db.products.search(query)

agent = Agent("sales").with_tools([search_products])

async def main():
    # Call a tool directly
    result = await agent.call("search_products", {"query": "laptop"})

    # Or let the LLM figure it out
    result = await agent.infer("Find me laptops under $1500")

asyncio.run(main())

call() runs a tool. infer() lets the LLM pick the tool and fill in the parameters from natural language.
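To make the difference concrete, here is a minimal sketch of the two paths. It is not agentu's implementation: `ToolRegistry`, `describe`, and the toy catalog are hypothetical. `call` dispatches by name with explicit parameters, while `describe` builds the kind of signature summary a model would need in order to pick a tool and fill in its arguments.

```python
import inspect

def _type_name(annotation):
    # Unannotated parameters are reported as "any"
    if annotation is inspect.Parameter.empty:
        return "any"
    return getattr(annotation, "__name__", str(annotation))

class ToolRegistry:
    """Hypothetical stand-in for the tool-handling part of an agent."""

    def __init__(self, tools):
        # Index tools by function name, as call("search_products", ...) implies
        self._tools = {fn.__name__: fn for fn in tools}

    def call(self, name, params):
        # Direct path: the caller names the tool and supplies every parameter
        return self._tools[name](**params)

    def describe(self):
        # LLM path, first half: summarize each tool so a model can choose one
        summary = []
        for name, fn in self._tools.items():
            signature = inspect.signature(fn)
            params = {p.name: _type_name(p.annotation) for p in signature.parameters.values()}
            summary.append({"name": name, "parameters": params})
        return summary

def search_products(query: str) -> list:
    catalog = ["laptop pro", "laptop air", "desk lamp"]  # toy data
    return [item for item in catalog if query in item]

registry = ToolRegistry([search_products])
direct = registry.call("search_products", {"query": "laptop"})
schema = registry.describe()
```

Type hints on your tool functions matter in a design like this, since they are the only machine-readable description of what each parameter expects.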

Declarative Configuration

You can also define agents in YAML or JSON files. The config maps the model, system prompt, webhooks (via Apprise), cache settings, and MCP tools, so no Python is needed for the setup itself:

1. Create a bot.yaml (or .json)

name: "support-agent"
model: "openai/gpt-4o"
system_prompt: "You are an expert IT agent."
notify:
  - "discord://webhook/id"
cache:
  preset: "distributed"

2. Load dynamically

from agentu import Agent
import asyncio

async def main():
    agent = await Agent.from_config("bot.yaml")
    
    # Optionally add local Python tools (resolve_ticket is your own function), then infer
    agent.with_tools([resolve_ticket])
    await agent.infer("Help me reset my router")

asyncio.run(main())

(Loading .yaml files requires pip install agentu[yaml]. JSON files load without extra dependencies.)
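For reference, the same configuration as JSON might look like the sketch below, assuming the YAML keys map one-to-one onto JSON keys. The snippet only parses and inspects the document with the standard library and does not call agentu. Treating `name` and `model` as required is an assumption made for the sanity check.

```python
import json

# JSON rendering of the bot.yaml example; the key names are copied from it
BOT_JSON = """
{
  "name": "support-agent",
  "model": "openai/gpt-4o",
  "system_prompt": "You are an expert IT agent.",
  "notify": ["discord://webhook/id"],
  "cache": {"preset": "distributed"}
}
"""

config = json.loads(BOT_JSON)

# Quick sanity check before handing the file to Agent.from_config
# (which keys are actually mandatory is an assumption here)
missing = [key for key in ("name", "model") if key not in config]
```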

Workflows

Chain agents with >> (sequential) and & (parallel):

# One after another
workflow = researcher("Find AI trends") >> analyst("Analyze") >> writer("Summarize")

# All at once
workflow = search("AI") & search("ML") & search("Crypto")

# Parallel first, then merge
workflow = (search("AI") & search("ML")) >> analyst("Compare findings")

result = await workflow.run()
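If you are curious how operators like these can work, the sketch below overloads `>>` and `&` on a small class and runs parallel branches with `asyncio.gather`. `Step`, `Sequential`, and `Parallel` are hypothetical names, and agentu's real classes may be structured differently.

```python
import asyncio

class Step:
    """Hypothetical workflow node."""

    def __init__(self, name):
        self.name = name

    async def run(self, prev=None):
        await asyncio.sleep(0)  # stand-in for an LLM or tool call
        return f"{self.name}({prev})" if prev is not None else self.name

    def __rshift__(self, other):
        return Sequential(self, other)

    def __and__(self, other):
        return Parallel(self, other)

class Sequential(Step):
    def __init__(self, first, second):
        self.first, self.second = first, second

    async def run(self, prev=None):
        # The second step receives the first step's output
        return await self.second.run(await self.first.run(prev))

class Parallel(Step):
    def __init__(self, left, right):
        self.left, self.right = left, right

    async def run(self, prev=None):
        # Both branches start together; results come back in declaration order
        return list(await asyncio.gather(self.left.run(prev), self.right.run(prev)))

workflow = (Step("search_ai") & Step("search_ml")) >> Step("compare")
result = asyncio.run(workflow.run())
```

Python gives `>>` higher precedence than `&`, so `a & b >> c` parses as `a & (b >> c)`. Keep the parentheses when a parallel group should finish before the next step.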

You can also pass data between steps with lambdas:

workflow = (
    researcher("Find companies")
    >> analyst(lambda prev: f"Extract top 5 from: {prev['result']}")
    >> writer(lambda prev: f"Write report about: {prev['companies']}")
)

Interrupted workflows can resume from the last successful step:

from agentu import resume_workflow

result = await workflow.run(checkpoint="./checkpoints", workflow_id="my-report")

# After a crash, pick up where you left off
await resume_workflow(result["checkpoint_path"])
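The idea behind resuming can be sketched in a few lines: persist each step's result as soon as it finishes, and skip anything already on disk the next time around. This is an illustration only. `run_steps` and the JSON file layout are assumptions, not agentu's checkpoint format.

```python
import json
import os
import tempfile

def run_steps(steps, checkpoint_path):
    """Run named steps in order, saving progress after each one."""
    done = {}
    if os.path.exists(checkpoint_path):
        with open(checkpoint_path) as fh:
            done = json.load(fh)
    executed = []
    for name, fn in steps:
        if name in done:
            continue  # finished in an earlier run, skip it
        done[name] = fn()
        executed.append(name)
        with open(checkpoint_path, "w") as fh:
            json.dump(done, fh)
    return done, executed

def flaky():
    raise RuntimeError("simulated crash")

path = os.path.join(tempfile.mkdtemp(), "my-report.json")

first_attempt = [("research", lambda: "raw notes"), ("analyze", flaky)]
try:
    run_steps(first_attempt, path)
except RuntimeError:
    pass  # the run dies at "analyze"; "research" is already on disk

second_attempt = [("research", lambda: "raw notes"), ("analyze", lambda: "insights")]
results, executed = run_steps(second_attempt, path)
```

On the second attempt only `analyze` executes, because the result of `research` was loaded from the checkpoint file.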

Caching

Cache LLM responses to skip redundant API calls. Works with both plain strings and full conversations.

# Basic: memory + SQLite, 1-hour TTL
agent = Agent("assistant").with_cache()

# Same prompt, same response — no API call
await agent.infer("What is Python?")  # hits the LLM
await agent.infer("What is Python?")  # instant, from cache

Presets

# Exact match only (memory + SQLite)
agent.with_cache(preset="basic")

# Semantic matching — "vegan food" hits cache for "plant-based meals"
agent.with_cache(preset="smart", similarity_threshold=0.9)

# Offline-friendly with filesystem backup and background sync
agent.with_cache(preset="offline")

# Redis-backed for distributed setups
agent.with_cache(preset="distributed", redis_url="redis://localhost:6379")

Conversation caching

Full conversation lists cache the same way strings do — deterministic serialization, same hash, same hit:

conversation = [
    {"role": "user", "content": "Hello!"},
    {"role": "assistant", "content": "Hi there!"},
    {"role": "user", "content": "What's the weather?"},
]
cache.set(conversation, "my-bot", "Looks sunny today.")
cache.get(conversation, "my-bot")  # → "Looks sunny today."

The second parameter is a namespace — any string that scopes the cache. Usually the model name, but it can be anything.
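Deterministic serialization is what lets a list of messages hash as reliably as a plain string. Below is a minimal sketch of such a key function, assuming JSON with sorted keys as the canonical form. `cache_key` is a hypothetical helper, and agentu's exact scheme may differ.

```python
import hashlib
import json

def cache_key(prompt, namespace, temperature=0.0):
    """Hash a string or a message list into a stable hex key."""
    # sort_keys and fixed separators make the output independent of dict ordering
    payload = json.dumps(
        {"prompt": prompt, "namespace": namespace, "temperature": temperature},
        sort_keys=True,
        separators=(",", ":"),
        ensure_ascii=False,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()

a = [{"role": "user", "content": "Hello!"}]
b = [{"content": "Hello!", "role": "user"}]  # same message, keys in a different order

same = cache_key(a, "my-bot") == cache_key(b, "my-bot")
scoped = cache_key(a, "my-bot") != cache_key(a, "other-bot")
```

Here `same` is true because key order no longer affects the hash, and `scoped` is true because the namespace is part of the hashed payload.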

How matching works

| Strategy | How | When |
|----------|-----|------|
| Exact | SHA-256 hash of prompt + namespace + temperature | Default, always runs first |
| Semantic | Cosine similarity of embedding vectors | preset="smart" or higher, runs on exact miss |

Semantic matching uses an embedding model (local all-MiniLM-L6-v2 or API-based nomic-embed-text) and only returns a hit when similarity exceeds the threshold (default 0.95).
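The threshold check itself is simple. The sketch below uses toy 3-dimensional vectors in place of real embeddings, and `semantic_lookup` is a hypothetical helper rather than part of agentu. It returns the closest cached response only when its cosine similarity exceeds the threshold.

```python
import math

def cosine_similarity(u, v):
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0

def semantic_lookup(query_vec, entries, threshold=0.95):
    """Return the best cached response above the threshold, else None."""
    best_score, best_response = 0.0, None
    for vec, response in entries:
        score = cosine_similarity(query_vec, vec)
        if score > best_score:
            best_score, best_response = score, response
    return best_response if best_score > threshold else None

# Toy "embeddings"; real models produce vectors with hundreds of dimensions
entries = [
    ([1.0, 0.0, 0.0], "cached answer A"),
    ([0.0, 1.0, 0.0], "cached answer B"),
]

hit = semantic_lookup([0.99, 0.05, 0.0], entries)   # nearly parallel to the first entry
miss = semantic_lookup([0.7, 0.7, 0.0], entries)    # halfway between both entries
```

Lowering the threshold trades precision for hit rate: more prompts are served from cache, but loosely related prompts may receive each other's answers.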

Memory

agent.remember("Customer prefers email", importance=0.9)
memories = agent.recall(query="communication preferences")

SQLite-backed, searchable, persistent across sessions.

Skills

Load domain expertise on demand, either from local paths or GitHub:

from agentu import Agent, Skill

# From GitHub (cached locally at ~/.agentu/skills/)
agent = Agent("assistant").with_skills([
    "hemanth/agentu-skills/pdf-processor",
    "openai/skills/code-review@v1.0",
])

# From local
agent = Agent("assistant").with_skills(["./skills/my-skill"])

# Or define inline
pdf_skill = Skill(
    name="pdf-processing",
    description="Extract text and tables from PDF files",
    instructions="skills/pdf/SKILL.md",
    resources={"forms": "skills/pdf/FORMS.md"}
)
agent = Agent("assistant").with_skills([pdf_skill])

Skills load progressively: metadata first (100 chars), then instructions (1500 chars), then resources only when needed.

Sessions

Stateful conversations with automatic context:

from agentu import SessionManager

manager = SessionManager()
session = manager.create_session(agent)

await session.send("What's the weather in SF?")
await session.send("What about tomorrow?")  # knows you mean SF

Multi-user isolation, SQLite persistence, session timeout handling.

Evaluation

Test your agents with simple assertions:

from agentu import evaluate

test_cases = [
    {"ask": "What's 5 + 3?", "expect": 8},
    {"ask": "Weather in SF?", "expect": "sunny"}
]

results = await evaluate(agent, test_cases)
print(f"Accuracy: {results.accuracy}%")
print(results.to_json())  # export for CI/CD

Matching strategies: exact, substring, LLM-as-judge, or custom validators.
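To see why the choice of strategy matters, here is a small sketch that scores the same canned answers with an exact matcher and a substring matcher. `exact`, `substring`, and `score` are hypothetical helpers written for this illustration, not agentu's own validators.

```python
def exact(expected, actual):
    # Compare as strings so that the integer 8 matches the reply "8"
    return str(expected) == str(actual).strip()

def substring(expected, actual):
    return str(expected).lower() in str(actual).lower()

def score(cases, answers, matcher):
    """Percentage of cases whose answer satisfies the matcher."""
    passed = sum(matcher(case["expect"], answer) for case, answer in zip(cases, answers))
    return 100.0 * passed / len(cases)

test_cases = [
    {"ask": "What's 5 + 3?", "expect": 8},
    {"ask": "Weather in SF?", "expect": "sunny"},
]

# Canned replies standing in for real agent output
answers = ["8", "It looks sunny in San Francisco today."]

exact_accuracy = score(test_cases, answers, exact)
substring_accuracy = score(test_cases, answers, substring)
```

The exact matcher passes only the arithmetic case (50.0), while the substring matcher passes both (100.0). Free-form answers usually need substring matching or a judge.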

Observability

All LLM calls and tool executions are tracked automatically:

from agentu import Agent, observe

observe.configure(output="console")  # or "json" or "silent"

agent = Agent("assistant").with_tools([...])
await agent.infer("Find me laptops")

metrics = agent.observer.get_metrics()
# {"tool_calls": 3, "total_duration_ms": 1240, "errors": 0}
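Automatic tracking of this kind is typically done by wrapping each call. The sketch below shows one way to collect the same three counters with a decorator. `Observer.track` is a hypothetical name, and this is not how agentu is necessarily implemented.

```python
import time
from functools import wraps

class Observer:
    """Hypothetical metrics collector mirroring the keys in the sample output."""

    def __init__(self):
        self.metrics = {"tool_calls": 0, "total_duration_ms": 0.0, "errors": 0}

    def track(self, fn):
        @wraps(fn)
        def wrapper(*args, **kwargs):
            start = time.perf_counter()
            try:
                return fn(*args, **kwargs)
            except Exception:
                self.metrics["errors"] += 1
                raise
            finally:
                # Runs on success and on failure, so every call is counted and timed
                self.metrics["tool_calls"] += 1
                self.metrics["total_duration_ms"] += (time.perf_counter() - start) * 1000
        return wrapper

observer = Observer()

@observer.track
def lookup(sku):
    if sku == "missing":
        raise KeyError(sku)
    return {"sku": sku}

lookup("a1")
lookup("b2")
try:
    lookup("missing")
except KeyError:
    pass  # the failure is recorded, then re-raised to the caller
```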

Dashboard

from agentu import serve

serve(agent, port=8000)
# http://localhost:8000/dashboard — live metrics
# http://localhost:8000/docs — auto-generated API docs

Notifications

Send low-latency, non-blocking alerts to Slack, Discord, Email, or SMS when an agent finishes its task.

pip install agentu[notify]

from agentu import Agent

# Attach notification middleware via the builder pattern
agent = Agent("my-bot").with_notifier([
    "slack://bot-token/channel-id",
    "discord://webhook_id/webhook_token"
])

# The agent executes without blocking, and posts a rich summary containing tokens and elapsed ms.
await agent.infer("Audit the database schema")

Custom Formatting & Failure Alerts

Notifications also fire when the agent fails (for example, on a rate-limit error). To control exactly how success and failure alerts look, provide a custom formatter:

from agentu.middleware import NotifyMiddleware

def custom_format(context, response, error) -> str:
    if error:
        return f"🚨 AGENT CRASH 🚨\n{error}"
    return f"✅ Agent {context.namespace} finished in {context.elapsed_ms}ms"

# Fall back to base use() method to pass the custom formatter
agent.use(NotifyMiddleware(
    targets=["slack://bot-token/channel-id"], 
    formatter=custom_format
))
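A formatter is a plain function, so you can try it out without sending anything. The sketch below calls the same formatter with a stand-in context. Using `SimpleNamespace` for the context object is an assumption, and only the `namespace` and `elapsed_ms` attributes used in the example above are modeled.

```python
from types import SimpleNamespace

def custom_format(context, response, error) -> str:
    if error:
        return f"🚨 AGENT CRASH 🚨\n{error}"
    return f"✅ Agent {context.namespace} finished in {context.elapsed_ms}ms"

# Stand-in for the context object the middleware would pass in (an assumption)
ctx = SimpleNamespace(namespace="my-bot", elapsed_ms=1240)

ok_message = custom_format(ctx, "schema looks fine", None)
fail_message = custom_format(ctx, None, RuntimeError("rate limit exceeded"))
```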

Ralph mode

Run agents in autonomous loops with progress tracking:

result = await agent.ralph(
    prompt_file="PROMPT.md",
    max_iterations=50,
    timeout_minutes=30,
    on_iteration=lambda i, data: print(f"[{i}] {data['result'][:50]}...")
)

The agent loops until all checkpoints in PROMPT.md are complete or limits are reached.
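The stopping logic can be sketched as follows. This sketch assumes that checkpoints are Markdown task-list items (`- [ ]` and `- [x]`), which may not match agentu's actual PROMPT.md format. `run_loop` and `tick_one` are hypothetical stand-ins for the real loop and for one agent iteration.

```python
import re
import time

def remaining_checkpoints(prompt_text):
    # Assumption: open checkpoints are unchecked Markdown task-list items
    return len(re.findall(r"^\s*[-*] \[ \]", prompt_text, flags=re.MULTILINE))

def run_loop(prompt_text, work, max_iterations=50, timeout_seconds=1800):
    """Call work() until no open checkpoints remain or a limit is reached."""
    start = time.monotonic()
    for i in range(1, max_iterations + 1):
        if remaining_checkpoints(prompt_text) == 0:
            return {"status": "complete", "iterations": i - 1}
        if time.monotonic() - start > timeout_seconds:
            return {"status": "timeout", "iterations": i - 1}
        prompt_text = work(prompt_text)
    status = "complete" if remaining_checkpoints(prompt_text) == 0 else "max_iterations"
    return {"status": status, "iterations": max_iterations}

def tick_one(text):
    # Stand-in for one agent iteration: mark the first open item as done
    return text.replace("[ ]", "[x]", 1)

prompt = "- [ ] write schema\n- [ ] add tests\n- [x] set up repo\n"

outcome = run_loop(prompt, tick_one)
limited = run_loop(prompt, tick_one, max_iterations=1)
```

With two open items, the first run finishes after two iterations. The second run stops at the iteration cap with one item still open.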

Tool search

When you have hundreds of tools, you don't want them all in context. Deferred tools are discovered on-demand:

agent = Agent("payments").with_tools(defer=[charge_card, send_receipt, refund_payment])

# Agent calls search_tools("charge card") → finds charge_card → executes it
result = await agent.infer("charge $50 to card_123")

A search_tools function is auto-added. The agent searches, activates, and calls — all internally.
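For intuition, a search function over deferred tools could be as simple as keyword matching on names and docstrings. The sketch below is an assumption about the general approach, not agentu's actual ranking. The tool bodies and the `limit` parameter are made up for the example.

```python
def charge_card(card_id: str, amount: float) -> str:
    """Charge a payment card for the given amount."""
    return f"charged {amount} to {card_id}"

def send_receipt(email: str) -> str:
    """Email a receipt to the customer."""
    return f"receipt sent to {email}"

def refund_payment(payment_id: str) -> str:
    """Refund a previous payment."""
    return f"refunded {payment_id}"

DEFERRED = [charge_card, send_receipt, refund_payment]

def search_tools(query, tools=DEFERRED, limit=2):
    """Rank tools by how many query words appear in their name or docstring."""
    words = query.lower().split()
    scored = []
    for fn in tools:
        haystack = f"{fn.__name__} {fn.__doc__ or ''}".lower().replace("_", " ")
        hits = sum(word in haystack for word in words)
        if hits:
            scored.append((hits, fn.__name__))
    # Highest score first; the stable sort keeps registration order on ties
    scored.sort(key=lambda pair: -pair[0])
    return [name for _, name in scored[:limit]]

matches = search_tools("charge card")
refunds = search_tools("refund")
```

Clear function names and docstrings pay off here, since they are the only text a search like this has to work with.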

MCP

Connect to Model Context Protocol servers:

agent = await Agent("bot").with_mcp(["http://localhost:3000"])
agent = await Agent("bot").with_mcp([
    {"url": "https://api.com/mcp", "headers": {"Auth": "Bearer xyz"}}
])

LLM support

Works with any OpenAI-compatible API. Auto-detects available models from Ollama:

Agent("assistant")                                        # first available Ollama model
Agent("assistant", model="qwen3")                         # specific model
Agent("assistant", model="gpt-4", api_key="sk-...")       # OpenAI
Agent("assistant", model="mistral", api_base="http://localhost:8000/v1")  # vLLM, LM Studio, etc.

REST API

from agentu import serve

serve(agent, port=8000, enable_cors=True)

Endpoints: /execute, /process, /tools, /memory/remember, /memory/recall, /docs

API reference

# Agent
agent = Agent(name)                       # auto-detect model
agent = Agent(name, model="qwen3")        # explicit model
agent = Agent(name, max_turns=5)          # limit multi-turn cycles
agent.with_tools([func1, func2])          # active tools
agent.with_tools(defer=[many_funcs])      # searchable tools
agent.with_cache(preset="smart")          # caching
agent.with_skills(["github/repo/skill"])  # skills
agent.with_notifier(["slack://bot-token"])  # notifications
await agent.with_mcp([url])               # MCP servers

await agent.call("tool", params)          # direct tool execution
await agent.infer("natural language")     # LLM-routed execution

agent.remember(content, importance=0.8)   # store memory
agent.recall(query)                       # search memory

# Sessions
manager = SessionManager()
session = manager.create_session(agent)
await session.send("message")
session.get_history(limit=10)
session.clear_history()

# Evaluation
results = await evaluate(agent, test_cases)
results.accuracy     # 95.0
results.to_json()    # export

# Workflows
step1 >> step2          # sequential
step1 & step2           # parallel
await workflow.run()    # execute

Examples

git clone https://github.com/hemanth/agentu && cd agentu

python examples/basic.py                # simple agent
python examples/workflow.py             # workflows
python examples/memory.py               # memory system
python examples/example_sessions.py     # stateful sessions
python examples/example_eval.py         # agent evaluation
python examples/example_observe.py      # observability
python examples/api.py                  # REST API

Testing

pytest
pytest --cov=agentu

License

MIT
