A production-ready Python framework for building autonomous AI agents that can plan, validate, and execute complex tasks using LLMs and custom tools
AgentV2 - Production-Ready Python Agent Framework
A modular, production-ready Python framework for building autonomous AI agents that can plan, validate, and execute complex tasks using LLMs and custom tools.
Install: pip install agentv2 | PyPI: pypi.org/project/agentv2
What is This Project?
AgentV2 is a deterministic, production-grade agent framework that separates concerns into distinct components:
- Planner: Generates structured todo lists from natural language tasks
- Validator: Ensures todos meet quality and domain-specific requirements
- Executor: Deterministically executes todos one at a time using LLM-guided actions
- Agent: High-level orchestrator that coordinates the entire workflow
- Session Memory: Lightweight caching and context management for conversational agents
The framework enforces strict architectural boundaries, ensuring predictable execution, no silent failures, and deterministic outcomes.
Quick Start
Installation
Install from PyPI (Recommended):
pip install agentv2
Package: agentv2 on PyPI
Or install from source (Development):
# Clone the repository
git clone <repository-url>
cd AgentV2
# Install in editable mode
pip install -e .
# Or install with development dependencies
pip install -e ".[dev]"
Environment Setup
Create a .env file:
API_KEY=your_api_key_here
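Whether AgentV2 reads the .env file on its own is not stated here. If it does not, the key has to be in the process environment before the agent is created. The following is a minimal standard-library sketch of that step; `parse_env` and `load_env_file` are hypothetical helper names, not part of the framework, and a dedicated package such as python-dotenv is a common alternative.

```python
import os
from pathlib import Path


def parse_env(text: str) -> dict[str, str]:
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    values: dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def load_env_file(path: str = ".env") -> dict[str, str]:
    """Copy variables from a .env file into os.environ, keeping existing values."""
    env_path = Path(path)
    if not env_path.exists():
        return {}
    values = parse_env(env_path.read_text(encoding="utf-8"))
    for key, value in values.items():
        os.environ.setdefault(key, value)
    return values
```

Calling `load_env_file()` once at program start, before constructing the agent, would make `API_KEY` visible through `os.environ`.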
Basic Usage
Example 1: Simple Task Execution
from agentv2.src.agent import Agent
import uuid

# Define tools
def add(a: int, b: int) -> int:
    return a + b

def multiply(a: int, b: int) -> int:
    return a * b

# The task below needs both operations, so both tools are registered
tools = {"add": add, "multiply": multiply}

# Create agent
agent = Agent(
    model="groq/openai/gpt-oss-120b",
    system_prompt="You are an autonomous execution agent.",
    session_id=f"session-{uuid.uuid4().hex[:8]}",
    tools=tools,
)

# Run a task
result = agent.run("Add 5 and 10, then multiply by 2")
print(result.final_reply)
Example 2: Chat Agent with Web Search
from agentv2.src.agent import Agent
from ddgs import DDGS
import uuid

def web_search(query: str) -> str:
    """Search the web and return the top results as plain text."""
    with DDGS() as ddgs:
        results = list(ddgs.text(query, max_results=5))
    # Each result is a dict; join title, URL, and snippet into one text block
    return "\n\n".join(
        f"{r.get('title', '')}\n{r.get('href', '')}\n{r.get('body', '')}"
        for r in results
    )

tools = {"web_search": web_search}

agent = Agent(
    model="groq/openai/gpt-oss-120b",
    system_prompt="You are a helpful assistant with web search.",
    session_id=f"chat-{uuid.uuid4().hex[:8]}",
    tools=tools,
)

# Use chat API (with session memory)
reply = agent.chat("What's the latest news about AI?")
print(reply)
Example 3: File Operations Agent
from agentv2.src.agent import Agent
from pathlib import Path
import uuid

def read_file(path: str) -> str:
    return Path(path).read_text()

def write_file(path: str, content: str) -> str:
    Path(path).write_text(content)
    # len(content) counts characters, not encoded bytes
    return f"Wrote {len(content)} characters to {path}"

tools = {
    "read_file": read_file,
    "write_file": write_file,
}

agent = Agent(
    model="groq/openai/gpt-oss-120b",
    system_prompt="You are a file operations agent.",
    session_id=f"fileops-{uuid.uuid4().hex[:8]}",
    tools=tools,
    domain_validator=None,  # Disable domain validation for file ops
)

result = agent.run("Create a hello.py file that prints 'Hello, World!'")
print(result.final_reply)
Key Features
1. Deterministic Execution
- No unbounded loops
- Strict step limits per todo
- Predictable outcomes
- No silent failures
2. Session Memory
- Automatic caching of exact-match tasks
- Context injection across multiple turns
- Lightweight, token-efficient
- Session-based isolation
3. Strict Validation
- Base validation (action verbs, length, forbidden phrases)
- Domain-specific validation (backend/frontend/data)
- Quality scoring (0.0-1.0)
- Auto-rewrite on failure (bounded attempts)
4. Tool Sandboxing
- Tools provided as callables
- Validated before execution
- Exceptions propagate as RuntimeError
- Results stored in authoritative memory
5. Rich Logging
- Structured, box-formatted logs
- Clear visual separation
- Execution stats and progress tracking
- Error reporting with context
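The tool-handling points above (tools passed as callables, checked before execution, exceptions surfaced as RuntimeError) can be sketched roughly as follows. This is an illustration under assumptions, not the framework's actual code: `call_tool` is a hypothetical helper name, and the exact checks AgentV2 performs are not documented here.

```python
from typing import Any, Callable


def call_tool(tools: dict[str, Callable[..., Any]], name: str, args: dict[str, Any]) -> Any:
    """Look up a tool by name, call it, and surface any failure as RuntimeError."""
    if name not in tools:
        raise RuntimeError(f"Unknown tool: {name!r}")
    tool = tools[name]
    if not callable(tool):
        raise RuntimeError(f"Tool {name!r} is not callable")
    try:
        return tool(**args)
    except Exception as exc:
        # Chain the original exception so the traceback keeps its context.
        raise RuntimeError(f"Tool {name!r} failed: {exc}") from exc


def add(a: int, b: int) -> int:
    return a + b


tools = {"add": add}
result = call_tool(tools, "add", {"a": 5, "b": 10})
```

Wrapping every failure in one exception type means the caller has a single error path to handle, which fits the "no silent failures" goal.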
Architecture Constraints
The framework enforces strict boundaries:
- Planner decides todos - Executor never modifies the plan
- LLM proposes actions - Only via AgentState schema
- Memory is authoritative - Executor enforces all invariants
- No retries in Agent - Failures propagate immediately
- Tools are sandboxed - Validated and isolated
- Deterministic execution - Same input → same output
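To illustrate what "the LLM only proposes actions through a schema" can mean in practice, here is a dependency-free sketch of strict parsing. The framework itself validates with Pydantic; this version uses standard-library dataclasses instead so it runs anywhere. The class `ProposedAction`, the function `parse_action`, and the field names `action`, `tool_name`, and `tool_args` are assumptions for illustration, not the real AgentState definition. Only the five action names come from this README.

```python
import json
from dataclasses import dataclass, field
from typing import Any, Optional

ALLOWED_ACTIONS = {"think", "tool", "complete_todo", "fail_todo", "noop"}
ALLOWED_KEYS = {"action", "tool_name", "tool_args"}  # hypothetical field names


@dataclass(frozen=True)
class ProposedAction:
    action: str
    tool_name: Optional[str] = None
    tool_args: dict[str, Any] = field(default_factory=dict)


def parse_action(raw: str) -> ProposedAction:
    """Parse an LLM reply strictly: JSON object, known keys, known action."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"Reply is not valid JSON: {exc}") from exc
    if not isinstance(data, dict):
        raise ValueError("Reply must be a JSON object")
    unknown = set(data) - ALLOWED_KEYS
    if unknown:
        raise ValueError(f"Unexpected keys: {sorted(unknown)}")
    action = data.get("action")
    if not isinstance(action, str) or action not in ALLOWED_ACTIONS:
        raise ValueError(f"Unknown action: {action!r}")
    tool_name = data.get("tool_name")
    if action == "tool" and not isinstance(tool_name, str):
        raise ValueError("A 'tool' action needs a string tool_name")
    tool_args = data.get("tool_args") or {}
    if not isinstance(tool_args, dict):
        raise ValueError("tool_args must be a JSON object")
    return ProposedAction(action=action, tool_name=tool_name, tool_args=tool_args)
```

Rejecting anything outside the schema, rather than guessing at the model's intent, keeps the executor as the only component that changes state.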
Project Structure
AgentV2/
├── src/
│   ├── agent.py          # High-level orchestrator
│   ├── planner.py        # Todo generation
│   ├── executor.py       # Deterministic execution
│   └── session_store.py  # Session memory management
├── schemas/
│   ├── AgentMemory.py    # Authoritative memory state
│   ├── AgentState.py     # LLM action proposals
│   ├── TodoSchema.py     # Todo data models
│   └── SessionMemory.py  # Session context model
├── utils/
│   ├── llm.py            # LLM interface (LiteLLM)
│   ├── logger.py         # Rich logging utilities
│   ├── validators.py     # Todo validation & scoring
│   └── Prompts.py        # Prompt template loader
├── prompts/
│   ├── Agent.md          # Execution prompt
│   ├── Todo.md           # Planning prompt
│   ├── FinalReply.md     # Summarization prompt
│   └── TodoRewrite.md    # Rewrite prompt
├── main.py               # Example: Basic tools
├── main2.py              # Example: Chat agent
├── main3.py              # Example: File operations
└── README.md
Use Cases
1. Task Automation
- Break down complex tasks into executable steps
- Execute multi-step workflows deterministically
- Handle file operations, API calls, data processing
2. Conversational Agents
- Chat interfaces with web search
- Context-aware responses
- Caching for repeated queries
- Session-based memory
3. Code Generation & File Operations
- Generate code files from descriptions
- Read and modify existing files
- Execute and test generated code
- Create full-stack applications
4. Data Processing Pipelines
- Extract, transform, and load data
- Validate and clean datasets
- Generate reports and summaries
5. API Integration Agents
- Interact with external APIs
- Process web search results
- Aggregate information from multiple sources
6. Development Assistants
- Generate boilerplate code
- Refactor existing codebases
- Write tests and documentation
- Debug and fix issues
Customization
Adding Custom Tools
def my_custom_tool(param1: str, param2: int) -> str:
    """Tool description for the LLM."""
    # Your logic here
    return "result"

tools = {
    "my_custom_tool": my_custom_tool,
}
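Since the docstring is described as the tool description for the LLM, type hints and a clear first docstring line are worth writing carefully. How AgentV2 actually presents tools to the model is not documented here; the sketch below only shows one plausible way to derive a description from a callable with the standard `inspect` module. `describe_tools` is a hypothetical helper.

```python
import inspect
from typing import Any, Callable


def describe_tools(tools: dict[str, Callable[..., Any]]) -> str:
    """Render one line per tool: name, signature, and first docstring line."""
    lines = []
    for name, fn in tools.items():
        signature = inspect.signature(fn)
        doc = inspect.getdoc(fn) or "No description."
        summary = doc.splitlines()[0]
        lines.append(f"{name}{signature}: {summary}")
    return "\n".join(lines)


def my_custom_tool(param1: str, param2: int) -> str:
    """Tool description for the LLM."""
    return "result"


description = describe_tools({"my_custom_tool": my_custom_tool})
print(description)
```

A tool without a docstring falls back to "No description." in this sketch, which is a reminder that an undocumented tool gives the model little to work with.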
Custom Domain Validators
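from agentv2.src.agent import Agent
from agentv2.utils.validators import DomainTodoValidator
# Assumed import path, inferred from the project structure (schemas/TodoSchema.py)
from agentv2.schemas.TodoSchema import TodoItemInput

class MyDomainValidator(DomainTodoValidator):
    FORBIDDEN = ["forbidden_term1", "forbidden_term2"]

    def validate(self, todo: TodoItemInput) -> None:
        # Your validation logic
        pass

agent = Agent(
    ...,  # other arguments as in the examples above
    domain_validator=MyDomainValidator(),
)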
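The body of `validate` is left open above. As a rough idea of what a forbidden-term check could look like, here is a stand-alone sketch that does not import the framework. `TodoSketch` and its `title` field are hypothetical stand-ins for the real todo model, and signalling failure by raising `ValueError` is an assumption suggested by the `-> None` return type, not confirmed behavior.

```python
from dataclasses import dataclass


@dataclass
class TodoSketch:
    """Stand-in for the framework's todo model (hypothetical field name)."""
    title: str


class ForbiddenTermValidator:
    """Reject todos that mention any forbidden term, case-insensitively."""

    FORBIDDEN = ["forbidden_term1", "forbidden_term2"]

    def validate(self, todo: TodoSketch) -> None:
        text = todo.title.lower()
        hits = [term for term in self.FORBIDDEN if term.lower() in text]
        if hits:
            # Name the offending terms so the rewrite step has something to act on.
            raise ValueError(f"Todo uses forbidden terms: {hits}")


validator = ForbiddenTermValidator()
validator.validate(TodoSketch(title="Create the user table"))  # passes silently
```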
Custom Prompts
Edit the markdown files in prompts/:
- Agent.md - Execution instructions
- Todo.md - Planning instructions
- FinalReply.md - Summarization instructions
Execution Flow Details
Planning Phase
- User provides task description
- Planner generates TodoListInput using LLM
- Validator checks base rules, domain rules, quality score
- Auto-rewrite invalid todos (up to 2 attempts)
- Return validated TodoList with UUIDs
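The validate-then-rewrite step in the planning phase is a bounded loop: check, rewrite if needed, and give up after a fixed number of attempts. A minimal sketch follows, with todos simplified to plain strings. `validate_with_rewrites`, `is_valid`, and `rewrite` are hypothetical names; in the framework the check would come from the Validator and the rewrite from an LLM call.

```python
from typing import Callable

MAX_REWRITE_ATTEMPTS = 2  # matches "up to 2 attempts" above


def validate_with_rewrites(
    todo: str,
    is_valid: Callable[[str], bool],
    rewrite: Callable[[str], str],
    max_attempts: int = MAX_REWRITE_ATTEMPTS,
) -> str:
    """Return a valid todo, rewriting at most max_attempts times, else raise."""
    current = todo
    for _ in range(max_attempts):
        if is_valid(current):
            return current
        current = rewrite(current)
    # The last rewrite still deserves one final check.
    if is_valid(current):
        return current
    raise ValueError(f"Todo still invalid after {max_attempts} rewrites: {current!r}")


ACTION_VERBS = {"create", "add", "write"}


def starts_with_verb(todo: str) -> bool:
    words = todo.split()
    return bool(words) and words[0].lower() in ACTION_VERBS


fixed = validate_with_rewrites("a hello.py file", starts_with_verb, lambda t: "Create " + t)
```

Raising once the attempt budget is spent, instead of accepting a weak todo, keeps planning failures visible to the caller.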
Execution Phase
- Executor iterates through todos sequentially
- For each todo:
  - LLM proposes AgentState (think/tool/complete_todo/fail_todo/noop)
  - Validate JSON strictly
  - Apply action deterministically
  - Update AgentMemory (authoritative state)
  - Enforce step limits (MAX_STEPS_PER_TODO)
- Continue until all todos complete or fail
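The per-todo loop can be pictured as a `for` loop over a fixed step budget rather than an open-ended `while`. The sketch below is illustrative only: the value of `MAX_STEPS_PER_TODO`, the `run_todo` and `scripted_llm` names, and the choice to raise when the budget runs out are all assumptions, and a scripted function stands in for the LLM so the example is repeatable.

```python
from typing import Callable

MAX_STEPS_PER_TODO = 5  # illustrative value; the real limit is not given here


def run_todo(todo: str, propose: Callable[[str, list[str]], str]) -> str:
    """Run one todo until it completes or fails, within a hard step budget."""
    history: list[str] = []
    for _ in range(MAX_STEPS_PER_TODO):
        action = propose(todo, history)
        history.append(action)
        if action == "complete_todo":
            return "completed"
        if action == "fail_todo":
            return "failed"
        # think / tool / noop would update memory here before the next step.
    # Exhausting the step budget is reported as an error, never ignored.
    raise RuntimeError(f"Step limit reached for todo: {todo!r}")


def scripted_llm(todo: str, history: list[str]) -> str:
    """Stand-in for the LLM: think, call a tool, then complete."""
    script = ["think", "tool", "complete_todo"]
    return script[len(history)]


status = run_todo("Add 5 and 10", scripted_llm)
```

Because the loop is bounded by construction, a model that keeps proposing `noop` cannot stall the executor.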
Summarization Phase
- Collect completed todos and final results
- Generate natural language summary
- Return final reply to user
Security & Safety
- Tool Sandboxing: Tools execute in controlled environment
- Input Validation: All LLM outputs validated with Pydantic
- Error Handling: No silent failures, all errors propagate
- Step Limits: Bounded execution prevents infinite loops
- Session Isolation: Each session_id has isolated memory
Logging
The framework uses Rich for beautiful, structured logging:
- Box-formatted panels for clear separation
- Color-coded success/error/warning messages
- Execution stats tables
- Todo lists with status indicators
- Structured logs written to files in the logs/ directory
Architecture & Logic
High-Level Flow
graph TD
A[User Task] --> B[Agent.run]
B --> C[Planner]
C --> D[Generate Todos]
D --> E[Validator]
E --> F{Valid?}
F -->|No| G[Auto-Rewrite]
G --> E
F -->|Yes| H[Executor]
H --> I[Execute Todos]
I --> J[LLM Proposes Action]
J --> K{Action Type}
K -->|tool| L[Execute Tool]
K -->|complete_todo| M[Mark Complete]
K -->|think| N[Update State]
L --> O[Update Memory]
M --> O
N --> O
O --> P{All Done?}
P -->|No| I
P -->|Yes| Q[Summarize]
Q --> R[Final Reply]
Component Responsibilities
graph LR
subgraph "Agent (Orchestrator)"
A1[Task Input] --> A2[Plan]
A2 --> A3[Validate]
A3 --> A4[Execute]
A4 --> A5[Summarize]
end
subgraph "Planner"
P1[Task] --> P2[LLM Generate]
P2 --> P3[TodoList]
end
subgraph "Validator"
V1[TodoList] --> V2[Base Rules]
V2 --> V3[Domain Rules]
V3 --> V4[Quality Score]
V4 --> V5{Pass?}
V5 -->|No| V6[Rewrite]
V6 --> V1
V5 -->|Yes| V7[Validated]
end
subgraph "Executor"
E1[TodoList] --> E2[Iterate Todos]
E2 --> E3[LLM Action]
E3 --> E4{Action}
E4 -->|tool| E5[Call Tool]
E4 -->|complete| E6[Mark Done]
E4 -->|think| E7[Update State]
E5 --> E8[Update Memory]
E6 --> E8
E7 --> E8
E8 --> E9{Next?}
E9 -->|Yes| E2
E9 -->|No| E10[Done]
end
A2 --> P1
A3 --> V1
A4 --> E1
V7 --> A4
E10 --> A5
Session Memory Flow
sequenceDiagram
participant U as User
participant A as Agent
participant S as SessionStore
participant C as Cache
U->>A: chat("What's the weather?")
A->>S: get(session_id)
S->>C: check_cache(normalized_task)
alt Cache Hit
C-->>A: cached_reply
A-->>U: cached_reply (no LLM call)
else Cache Miss
A->>A: plan + execute
A->>C: cache_reply(task, reply)
A-->>U: final_reply
end
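The cache-hit and cache-miss branches in the diagram can be sketched as a per-session dictionary keyed by normalized task text. This is an assumption-laden illustration: `SessionCache`, `normalize`, and the normalization rule itself (lowercase, collapsed whitespace) are hypothetical, since the diagram only names a `normalized_task` without defining it.

```python
from typing import Optional


def normalize(task: str) -> str:
    """Collapse whitespace and case so trivially different phrasings match."""
    return " ".join(task.lower().split())


class SessionCache:
    """Per-session exact-match cache keyed by the normalized task text."""

    def __init__(self) -> None:
        self._replies: dict[str, dict[str, str]] = {}

    def get(self, session_id: str, task: str) -> Optional[str]:
        # A miss returns None, which tells the caller to plan and execute.
        return self._replies.get(session_id, {}).get(normalize(task))

    def put(self, session_id: str, task: str, reply: str) -> None:
        self._replies.setdefault(session_id, {})[normalize(task)] = reply


cache = SessionCache()
cache.put("chat-1", "What's the weather?", "Sunny.")
hit = cache.get("chat-1", "  what's the   WEATHER? ")
miss = cache.get("chat-2", "What's the weather?")
```

Keying the outer dictionary by `session_id` is what keeps one session's cached replies from leaking into another.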
Contributing
This is a production-ready framework with strict architectural constraints. When contributing:
- Maintain separation of concerns (Planner/Executor/Agent)
- Never add retry logic in Agent
- Always validate LLM outputs
- Keep execution deterministic
- Add tests for new features
License
This project is licensed under the MIT License - see the LICENSE file for details.
Acknowledgments
- Built with LiteLLM for LLM abstraction
- Uses Rich for beautiful terminal output
- DuckDuckGo Search for web search capabilities
Made With Hate and Love
Because perfection isn't born out of love; it's forged in frustration, obsession, and an unrelenting pursuit of something better.