AgentV2 - Production-Ready Python Agent Framework

A modular, production-ready Python framework for building autonomous AI agents that can plan, validate, and execute complex tasks using LLMs and custom tools.

🎯 What is This Project?

AgentV2 is a deterministic, production-grade agent framework that separates concerns into distinct components:

Planner: Generates structured todo lists from natural language tasks
Validator: Ensures todos meet quality and domain-specific requirements
Executor: Deterministically executes todos one at a time using LLM-guided actions
Agent: High-level orchestrator that coordinates the entire workflow
Session Memory: Lightweight caching and context management for conversational agents

The framework enforces strict architectural boundaries, ensuring predictable execution, no silent failures, and deterministic outcomes.

🏗️ Architecture & Logic

High-Level Flow

graph TD
    A[User Task] --> B[Agent.run]
    B --> C[Planner]
    C --> D[Generate Todos]
    D --> E[Validator]
    E --> F{Valid?}
    F -->|No| G[Auto-Rewrite]
    G --> E
    F -->|Yes| H[Executor]
    H --> I[Execute Todos]
    I --> J[LLM Proposes Action]
    J --> K{Action Type}
    K -->|tool| L[Execute Tool]
    K -->|complete_todo| M[Mark Complete]
    K -->|think| N[Update State]
    L --> O[Update Memory]
    M --> O
    N --> O
    O --> P{All Done?}
    P -->|No| I
    P -->|Yes| Q[Summarize]
    Q --> R[Final Reply]
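The flow above can be sketched in plain Python. This is only an illustration under stated assumptions, not AgentV2's actual implementation: `Todo`, `RunResult`, `plan`, `validate`, `execute`, `summarize`, and `run` are hypothetical names, and the LLM calls are replaced by simple stubs.

```python
from dataclasses import dataclass, field


@dataclass
class Todo:
    title: str
    done: bool = False


@dataclass
class RunResult:
    todos: list = field(default_factory=list)
    final_reply: str = ""


def plan(task: str) -> list:
    # Hypothetical planner stub: a real planner would ask an LLM for todos.
    return [Todo(title=part.strip()) for part in task.split(", then ") if part.strip()]


def validate(todos: list) -> list:
    # Hypothetical validator stub: reject empty plans instead of failing silently.
    if not todos:
        raise ValueError("planner produced no todos")
    return todos


def execute(todos: list) -> list:
    # Hypothetical executor stub: handle todos one at a time, in order.
    for todo in todos:
        todo.done = True
    return todos


def summarize(todos: list) -> str:
    completed = [t.title for t in todos if t.done]
    return f"Completed {len(completed)} step(s): " + "; ".join(completed)


def run(task: str) -> RunResult:
    # Plan, validate, execute, summarize, with no retries at this level.
    todos = execute(validate(plan(task)))
    return RunResult(todos=todos, final_reply=summarize(todos))


result = run("Add 5 and 10, then multiply by 2")
print(result.final_reply)
```

Each stage takes the previous stage's output and either returns a value or raises, so a failure anywhere stops the run rather than being swallowed.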

Component Responsibilities

graph LR
    subgraph "Agent (Orchestrator)"
        A1[Task Input] --> A2[Plan]
        A2 --> A3[Validate]
        A3 --> A4[Execute]
        A4 --> A5[Summarize]
    end
    
    subgraph "Planner"
        P1[Task] --> P2[LLM Generate]
        P2 --> P3[TodoList]
    end
    
    subgraph "Validator"
        V1[TodoList] --> V2[Base Rules]
        V2 --> V3[Domain Rules]
        V3 --> V4[Quality Score]
        V4 --> V5{Pass?}
        V5 -->|No| V6[Rewrite]
        V6 --> V1
        V5 -->|Yes| V7[Validated]
    end
    
    subgraph "Executor"
        E1[TodoList] --> E2[Iterate Todos]
        E2 --> E3[LLM Action]
        E3 --> E4{Action}
        E4 -->|tool| E5[Call Tool]
        E4 -->|complete| E6[Mark Done]
        E4 -->|think| E7[Update State]
        E5 --> E8[Update Memory]
        E6 --> E8
        E7 --> E8
        E8 --> E9{Next?}
        E9 -->|Yes| E2
        E9 -->|No| E10[Done]
    end
    
    A2 --> P1
    A3 --> V1
    A4 --> E1
    V7 --> A4
    E10 --> A5

Session Memory Flow

sequenceDiagram
    participant U as User
    participant A as Agent
    participant S as SessionStore
    participant C as Cache
    
    U->>A: chat("What's the weather?")
    A->>S: get(session_id)
    S->>C: check_cache(normalized_task)
    alt Cache Hit
        C-->>A: cached_reply
        A-->>U: cached_reply (no LLM call)
    else Cache Miss
        A->>A: plan + execute
        A->>C: cache_reply(task, reply)
        A-->>U: final_reply
    end
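The cache-hit and cache-miss branches above could look roughly like the following. It is a minimal sketch: the normalization rule (lowercase, collapsed whitespace), the `SessionStore` layout, and the `chat` helper are assumptions for illustration, and the plan-and-execute step is replaced by a stub.

```python
from typing import Callable, Optional


def normalize_task(task: str) -> str:
    # Assumed normalization: lowercase and collapse runs of whitespace.
    return " ".join(task.lower().split())


class SessionStore:
    def __init__(self) -> None:
        # session_id -> {normalized_task: cached_reply}; hypothetical layout.
        self._sessions = {}

    def get_cached_reply(self, session_id: str, task: str) -> Optional[str]:
        return self._sessions.get(session_id, {}).get(normalize_task(task))

    def cache_reply(self, session_id: str, task: str, reply: str) -> None:
        self._sessions.setdefault(session_id, {})[normalize_task(task)] = reply


def chat(store: SessionStore, session_id: str, task: str,
         answer: Callable[[str], str]) -> str:
    cached = store.get_cached_reply(session_id, task)
    if cached is not None:
        return cached  # cache hit: no LLM call
    reply = answer(task)  # cache miss: plan + execute (stubbed here)
    store.cache_reply(session_id, task, reply)
    return reply


calls = []


def fake_agent(task: str) -> str:
    # Stand-in for the real plan + execute path; records each invocation.
    calls.append(task)
    return f"reply to: {task}"


store = SessionStore()
first = chat(store, "s1", "What's the weather?", fake_agent)
second = chat(store, "s1", "  what's the   WEATHER? ", fake_agent)
third = chat(store, "s2", "What's the weather?", fake_agent)
print(len(calls))
```

The second call is served from the cache because it normalizes to the same key, while the third call misses because it belongs to a different session.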

🚀 Quick Start

Installation

# Clone the repository
git clone <repository-url>
cd AgentV2

# Install dependencies (using uv)
uv sync

# Or using pip
pip install -r requirements.txt

Environment Setup

Create a .env file:

GROQ_API_KEY=your_groq_api_key_here

Basic Usage

Example 1: Simple Task Execution

from src.agent import Agent
import uuid

# Define tools
def add(a: int, b: int) -> int:
    return a + b

def multiply(a: int, b: int) -> int:
    return a * b

tools = {"add": add, "multiply": multiply}

# Create agent
agent = Agent(
    model="groq/openai/gpt-oss-120b",
    system_prompt="You are an autonomous execution agent.",
    session_id=f"session-{uuid.uuid4().hex[:8]}",
    tools=tools,
)

# Run a task
result = agent.run("Add 5 and 10, then multiply by 2")
print(result.final_reply)

Example 2: Chat Agent with Web Search

from src.agent import Agent
from ddgs import DDGS
import uuid

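def web_search(query: str) -> str:
    """Search the web and return the top results as plain text."""
    with DDGS() as ddgs:
        results = list(ddgs.text(query, max_results=5))
    # Each result is a dict; the "title", "href", and "body" keys are assumed here
    return "\n\n".join(
        f"{r.get('title', '')}\n{r.get('href', '')}\n{r.get('body', '')}" for r in results
    )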

tools = {"web_search": web_search}

agent = Agent(
    model="groq/openai/gpt-oss-120b",
    system_prompt="You are a helpful assistant with web search.",
    session_id=f"chat-{uuid.uuid4().hex[:8]}",
    tools=tools,
)

# Use chat API (with session memory)
reply = agent.chat("What's the latest news about AI?")
print(reply)

Example 3: File Operations Agent

from src.agent import Agent
from pathlib import Path
import uuid

def read_file(path: str) -> str:
    return Path(path).read_text()

def write_file(path: str, content: str) -> str:
    Path(path).write_text(content)
    return f"Wrote {len(content)} characters to {path}"

tools = {
    "read_file": read_file,
    "write_file": write_file,
}

agent = Agent(
    model="groq/openai/gpt-oss-120b",
    system_prompt="You are a file operations agent.",
    session_id=f"fileops-{uuid.uuid4().hex[:8]}",
    tools=tools,
    domain_validator=None,  # Disable domain validation for file ops
)

result = agent.run("Create a hello.py file that prints 'Hello, World!'")
print(result.final_reply)

🎨 Key Features

1. Deterministic Execution

No unbounded loops
Strict step limits per todo
Predictable outcomes
No silent failures
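A bounded per-todo loop is one way to get these properties. The sketch below is illustrative only: the limit value, the `StepLimitExceeded` exception, and the `step` callback are assumptions, not the framework's actual names or behavior.

```python
from typing import Callable

MAX_STEPS_PER_TODO = 3  # illustrative value; the framework's real limit is not stated here


class StepLimitExceeded(RuntimeError):
    """Raised when a todo does not finish within the step budget."""


def run_todo(todo: str, step: Callable[[str, int], bool]) -> int:
    # `step` is a hypothetical callback that returns True once the todo is done.
    for attempt in range(1, MAX_STEPS_PER_TODO + 1):
        if step(todo, attempt):
            return attempt
    # No silent failure: exhausting the budget is an explicit error.
    raise StepLimitExceeded(f"{todo!r} did not complete in {MAX_STEPS_PER_TODO} steps")


steps_used = run_todo("write file", lambda todo, n: n == 2)
print(steps_used)

try:
    run_todo("never finishes", lambda todo, n: False)
except StepLimitExceeded as exc:
    print(f"failed: {exc}")
```

Because the loop is a `for` over a fixed range, the worst case is known up front, and running out of steps surfaces as an exception rather than a quiet exit.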

2. Session Memory

Automatic caching of exact-match tasks
Context injection across multiple turns
Lightweight, token-efficient
Session-based isolation

3. Strict Validation

Base validation (action verbs, length, forbidden phrases)
Domain-specific validation (backend/frontend/data)
Quality scoring (0.0-1.0)
Auto-rewrite on failure (bounded attempts)
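To make the base rules and the 0.0-1.0 score concrete, here is a small sketch. The verb list, forbidden phrases, word bounds, and the scoring formula (fraction of rules passed) are all assumptions chosen for illustration; the framework's real rules live in utils/validators.py and may differ.

```python
ACTION_VERBS = {"add", "create", "read", "write", "search", "update", "delete"}  # assumed list
FORBIDDEN_PHRASES = ("and so on", "somehow", "as needed")  # assumed list
MIN_WORDS, MAX_WORDS = 3, 20  # assumed bounds


def check_todo(title: str) -> list:
    """Return a list of rule violations; an empty list means the todo passes."""
    lowered = title.lower()
    words = lowered.split()
    problems = []
    if not words or words[0] not in ACTION_VERBS:
        problems.append("must start with an action verb")
    if not MIN_WORDS <= len(words) <= MAX_WORDS:
        problems.append(f"must be {MIN_WORDS}-{MAX_WORDS} words long")
    if any(phrase in lowered for phrase in FORBIDDEN_PHRASES):
        problems.append("contains a forbidden phrase")
    return problems


def quality_score(title: str) -> float:
    """Score in [0.0, 1.0]: the fraction of the three base rules that pass."""
    return round(1.0 - len(check_todo(title)) / 3, 2)


print(check_todo("Create hello.py that prints a greeting"))
print(quality_score("Do stuff somehow"))
```

Returning the list of violations, rather than a bare boolean, gives a rewrite step something specific to act on.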

4. Tool Sandboxing

Tools provided as callables
Validated before execution
Exceptions propagate as RuntimeError
Results stored in authoritative memory
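One way to read these four points is as a thin wrapper around each tool call. The sketch below assumes a hypothetical `call_tool` helper and a plain dict standing in for the authoritative memory; it is not the framework's executor code.

```python
from typing import Any, Callable, Dict


def call_tool(tools: Dict[str, Callable[..., Any]], name: str, args: Dict[str, Any]) -> Any:
    # Validate before execution: the tool must be registered and callable.
    tool = tools.get(name)
    if not callable(tool):
        raise RuntimeError(f"unknown tool: {name!r}")
    try:
        return tool(**args)
    except Exception as exc:
        # Surface every tool failure as RuntimeError, keeping the original cause.
        raise RuntimeError(f"tool {name!r} failed: {exc}") from exc


def add(a: int, b: int) -> int:
    return a + b


tools = {"add": add}
memory = {"last_tool": None, "last_result": None}  # stand-in for authoritative memory

memory["last_result"] = call_tool(tools, "add", {"a": 5, "b": 10})
memory["last_tool"] = "add"
print(memory)
```

Chaining with `from exc` keeps the original traceback available while callers only need to handle one exception type.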

5. Rich Logging

Structured, box-formatted logs
Clear visual separation
Execution stats and progress tracking
Error reporting with context

🔧 Architecture Constraints

The framework enforces strict boundaries:

Planner decides todos - Executor never modifies the plan
LLM proposes actions - Only via AgentState schema
Memory is authoritative - Executor enforces all invariants
No retries in Agent - Failures propagate immediately
Tools are sandboxed - Validated and isolated
Deterministic execution - Same input → same output

📁 Project Structure

AgentV2/
├── src/
│   ├── agent.py          # High-level orchestrator
│   ├── planner.py        # Todo generation
│   ├── executor.py        # Deterministic execution
│   └── session_store.py  # Session memory management
├── schemas/
│   ├── AgentMemory.py    # Authoritative memory state
│   ├── AgentState.py     # LLM action proposals
│   ├── TodoSchema.py     # Todo data models
│   └── SessionMemory.py  # Session context model
├── utils/
│   ├── llm.py            # LLM interface (LiteLLM)
│   ├── logger.py         # Rich logging utilities
│   ├── validators.py     # Todo validation & scoring
│   └── Prompts.py        # Prompt template loader
├── prompts/
│   ├── Agent.md          # Execution prompt
│   ├── Todo.md           # Planning prompt
│   ├── FinalReply.md     # Summarization prompt
│   └── TodoRewrite.md    # Rewrite prompt
├── main.py               # Example: Basic tools
├── main2.py             # Example: Chat agent
├── main3.py             # Example: File operations
└── README.md

💡 Use Cases

1. Task Automation

Break down complex tasks into executable steps
Execute multi-step workflows deterministically
Handle file operations, API calls, data processing

2. Conversational Agents

Chat interfaces with web search
Context-aware responses
Caching for repeated queries
Session-based memory

3. Code Generation & File Operations

Generate code files from descriptions
Read and modify existing files
Execute and test generated code
Create full-stack applications

4. Data Processing Pipelines

Extract, transform, and load data
Validate and clean datasets
Generate reports and summaries

5. API Integration Agents

Interact with external APIs
Process web search results
Aggregate information from multiple sources

6. Development Assistants

Generate boilerplate code
Refactor existing codebases
Write tests and documentation
Debug and fix issues

🛠️ Customization

Adding Custom Tools

def my_custom_tool(param1: str, param2: int) -> str:
    """Tool description for the LLM."""
    # Your logic here
    return "result"

tools = {
    "my_custom_tool": my_custom_tool,
}
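Type hints and a docstring matter because they are the only description of the tool that code can read. How AgentV2 actually presents tools to the LLM is not documented here; the sketch below shows one plausible approach using the standard `inspect` module, with `describe_tool` as a hypothetical helper.

```python
import inspect
from typing import Callable


def describe_tool(name: str, func: Callable) -> str:
    # Combine the signature and the first docstring line into one prompt line.
    signature = inspect.signature(func)
    doc = inspect.getdoc(func) or "No description."
    return f"{name}{signature}: {doc.splitlines()[0]}"


def my_custom_tool(param1: str, param2: int) -> str:
    """Tool description for the LLM."""
    return "result"


print(describe_tool("my_custom_tool", my_custom_tool))
```

A tool with no docstring still gets a line, but the placeholder text tells the model nothing, so a one-sentence docstring is worth writing.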

Custom Domain Validators

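from src.agent import Agent
from schemas.TodoSchema import TodoItemInput  # assumed location, based on the project structure
from utils.validators import DomainTodoValidator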

class MyDomainValidator(DomainTodoValidator):
    FORBIDDEN = ["forbidden_term1", "forbidden_term2"]
    
    def validate(self, todo: TodoItemInput) -> None:
        # Your validation logic
        pass

agent = Agent(
    ...,
    domain_validator=MyDomainValidator(),
)

Custom Prompts

Edit the markdown files in prompts/:

Agent.md - Execution instructions
Todo.md - Planning instructions
FinalReply.md - Summarization instructions
TodoRewrite.md - Rewrite instructions

📊 Execution Flow Details

Planning Phase

User provides task description
Planner generates TodoListInput using LLM
Validator checks base rules, domain rules, quality score
Auto-rewrite invalid todos (up to 2 attempts)
Return validated TodoList with UUIDs
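The bounded auto-rewrite step can be sketched as a short loop. `validate_with_rewrite`, `is_valid`, and `rewrite` are hypothetical stand-ins for the validator and the LLM rewrite call; only the limit of 2 attempts comes from the description above.

```python
from typing import Callable

MAX_REWRITE_ATTEMPTS = 2  # matches the "up to 2 attempts" described above


def validate_with_rewrite(
    todo: str,
    is_valid: Callable[[str], bool],
    rewrite: Callable[[str], str],
) -> str:
    # Check first, then rewrite at most MAX_REWRITE_ATTEMPTS times.
    for _ in range(MAX_REWRITE_ATTEMPTS):
        if is_valid(todo):
            return todo
        todo = rewrite(todo)
    if is_valid(todo):
        return todo
    raise ValueError(f"todo still invalid after {MAX_REWRITE_ATTEMPTS} rewrites: {todo!r}")


fixed = validate_with_rewrite(
    "the config file",
    is_valid=lambda t: t.lower().startswith(("create", "read", "write")),
    rewrite=lambda t: f"Read {t}",
)
print(fixed)
```

If the todo is still invalid after the last rewrite, the function raises instead of passing a bad plan on to execution.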

Execution Phase

Executor iterates through todos sequentially
For each todo:
- LLM proposes AgentState (think/tool/complete_todo/fail_todo/noop)
- Validate JSON strictly
- Apply action deterministically
- Update AgentMemory (authoritative state)
- Enforce step limits (MAX_STEPS_PER_TODO)
Continue until all todos complete or fail
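The parse, validate, and apply steps for a single todo might look like this. To stay dependency-free, the sketch uses `json` and a dataclass instead of Pydantic, and the JSON field names (`action`, `tool`, `args`) and the `Memory` fields are assumptions rather than the real AgentState and AgentMemory schemas.

```python
import json
from dataclasses import dataclass, field

ALLOWED_ACTIONS = {"think", "tool", "complete_todo", "fail_todo", "noop"}


@dataclass
class Memory:
    # Simplified stand-in for AgentMemory; field names are assumptions.
    states: list = field(default_factory=list)
    last_result: object = None
    status: str = "pending"


def parse_action(raw: str) -> dict:
    # Strict parsing: reject malformed JSON and unknown action types.
    data = json.loads(raw)
    if not isinstance(data, dict) or data.get("action") not in ALLOWED_ACTIONS:
        raise ValueError(f"invalid action payload: {raw!r}")
    return data


def apply_action(memory: Memory, action: dict, tools: dict) -> None:
    memory.states.append(action)
    kind = action["action"]
    if kind == "tool":
        memory.last_result = tools[action["tool"]](**action.get("args", {}))
    elif kind == "complete_todo":
        memory.status = "completed"
    elif kind == "fail_todo":
        memory.status = "failed"
    # "think" and "noop" only record the state.


tools = {"add": lambda a, b: a + b}
memory = Memory()
proposals = [
    '{"action": "tool", "tool": "add", "args": {"a": 5, "b": 10}}',
    '{"action": "complete_todo"}',
]
for raw in proposals:
    apply_action(memory, parse_action(raw), tools)
print(memory.last_result, memory.status)
```

Only `apply_action` mutates memory, which mirrors the idea that the LLM proposes actions while the executor owns the state.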

Summarization Phase

Collect completed todos and final results
Generate natural language summary
Return final reply to user

🔒 Security & Safety

Tool Sandboxing: Tools execute in a controlled environment
Input Validation: All LLM outputs validated with Pydantic
Error Handling: No silent failures, all errors propagate
Step Limits: Bounded execution prevents infinite loops
Session Isolation: Each session_id has isolated memory

📝 Logging

The framework uses Rich for beautiful, structured logging:

Box-formatted panels for clear separation
Color-coded success/error/warning messages
Execution stats tables
Todo lists with status indicators
Structured logs to files in logs/ directory

🤝 Contributing

This is a production-ready framework with strict architectural constraints. When contributing:

Maintain separation of concerns (Planner/Executor/Agent)
Never add retry logic in Agent
Always validate LLM outputs
Keep execution deterministic
Add tests for new features

📄 License

[Add your license here]

🙏 Acknowledgments

Built with LiteLLM for LLM abstraction
Uses Rich for beautiful terminal output
DuckDuckGo Search for web search capabilities

Note: This framework is designed for production use with strict architectural boundaries. Ensure you understand the constraints before extending functionality.

sequenceDiagram
    participant User
    participant Agent
    participant SessionMemory
    participant AgentMemory
    participant Executor
    participant LLM

    User->>Agent: run("task")
    Agent->>SessionMemory: get_cached_reply(normalized_task)
    alt Cache Hit
        SessionMemory-->>Agent: cached_reply
        Agent-->>User: AgentMemory(final_reply=cached_reply)
    else Cache Miss
        Agent->>AgentMemory: new AgentMemory(todos=...)
        Agent->>Executor: run(memory)
        loop For each todo
            Executor->>LLM: propose action
            LLM-->>Executor: AgentState
            Executor->>AgentMemory: append_state(state)
            Executor->>AgentMemory: update last_tool/result
            Executor->>AgentMemory: mark todo complete
        end
        Executor-->>Agent: updated AgentMemory
        Agent->>AgentMemory: generate final_reply
        Agent->>SessionMemory: cache_reply(task, reply)
        Agent-->>User: AgentMemory with final_reply
    end

Project details

Release history

0.1.4 - Jan 25, 2026
0.1.3 - Jan 15, 2026
0.1.2 - Jan 15, 2026
0.1.1 - Jan 15, 2026
0.1.0 (this version) - Jan 15, 2026

Download files

Source Distribution

agentv2-0.1.0.tar.gz (31.3 kB view details)

Uploaded Jan 15, 2026 Source

Built Distribution

agentv2-0.1.0-py3-none-any.whl (33.8 kB view details)

Uploaded Jan 15, 2026 Python 3

File details

Details for the file agentv2-0.1.0.tar.gz.

File metadata

Download URL: agentv2-0.1.0.tar.gz
Upload date: Jan 15, 2026
Size: 31.3 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.12.12

File hashes

Hashes for agentv2-0.1.0.tar.gz
Algorithm	Hash digest
SHA256	`10ebb804043c881e1b2a3273ada2fd0664d333711e2ea91f3210b2d5d9fa8021`
MD5	`7c821aad96c2e6b9fa3833213dafdf39`
BLAKE2b-256	`a7e4f320f4104cfd97fedf9e5d974dfd25fa6ab6c213f10a9e0e12ea79fa52f3`

File details

Details for the file agentv2-0.1.0-py3-none-any.whl.

File metadata

Download URL: agentv2-0.1.0-py3-none-any.whl
Upload date: Jan 15, 2026
Size: 33.8 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.12.12

File hashes

Hashes for agentv2-0.1.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`5dece8b54dc89767950ccba571f8912a41db3de3790705e200b792fd66bbcea9`
MD5	`5a303c5290481117cefb538dceac08c8`
BLAKE2b-256	`dd8be540515d85ab936f8017cec20d5657a1ff610d657dbbbe885df48bdfdfd5`
