
Maximum Continual

A clean API for continual learning with LoRA models using reward-based feedback.

Overview

Maximum Continual is a Python library that enables continual learning for AI agents through:

  • Agent-based Architecture: Uses a code execution agent with tool access
  • LoRA Integration: Leverages Low-Rank Adaptation for efficient model fine-tuning
  • Modal Backend: Scalable cloud-based model hosting and training
  • Reward-based Learning: Updates models based on performance feedback
  • Tool System: Extensible framework for custom tools and capabilities

Quick Start

Installation

The project uses Poetry for dependency management. From a local checkout, install in editable mode:

pip install -e .
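
If you're working from a clone of the repository, installing through Poetry should work as well (assuming a standard pyproject.toml):

poetry install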

Basic Usage

from maximum_continual import MaximumContinual, Tool, MessageT
from maximum_continual.system_prompt import fetch_default_system_prompt
from maximum_continual.types import PredictionResponseWithRewardT
from pydantic import BaseModel

# Define a custom tool
class WebSearchTool(Tool):
    name = "web_search"
    description = "Performs a web search and returns results"
    inputs = {"query": {"type": "string", "description": "Search query"}}
    output_type = "string"
    
    def forward(self, query: str) -> str:
        # Implementation here
        return "Search results..."

# Define response structure
class FinalAnswer(BaseModel):
    answer: str
    reasoning: str

# Initialize client
client = MaximumContinual(auto_deploy=True)

# Create and use a model
with client.init_model(model_id="my_model") as model:
    tools = [WebSearchTool()]
    
    # Make a prediction
    response = model.predict(
        messages=[
            MessageT(role="system", content=fetch_default_system_prompt(
                tools, 
                additional_authorized_imports=["os"], 
                final_answer_model=FinalAnswer
            )),
            MessageT(role="user", content="Search for information about Python")
        ],
        final_answer_model=FinalAnswer,
        tools=tools,
        additional_authorized_imports=["os"]
    )
    
    # Update model with reward feedback
    model.update([
        PredictionResponseWithRewardT(
            prediction=response,
            reward=1.0  # Positive reward for good performance
        )
    ])

Architecture

Core Components

1. MaximumContinual Client

The main entry point that handles:

  • Modal backend deployment and management
  • Model lifecycle (initialization, loading, cleanup)
  • Backend health monitoring

2. Agent System

  • MaximumContinualAgent: Orchestrates the agent loop
  • CodeExecutorTool: Executes Python code with tool access
  • LocalPythonExecutor: Sandboxed Python execution environment

3. Backend Infrastructure

  • Modal Backend: Cloud-based model hosting and LoRA training
  • vLLM Backend: High-performance model inference
  • LoRA Management: Dynamic adapter loading and unloading

4. Tool System

  • Base Tools: Foundation for creating custom tools
  • Tool Validation: Ensures tool safety and compatibility
  • State Persistence: Tools maintain state across executions

How It Works

  1. Model Initialization: Creates or loads an existing model with optional LoRA adapters
  2. Agent Loop:
    • Receives messages and available tools
    • Uses code executor to run Python code
    • Tools are accessible as Python functions within the execution environment
    • Iterates until a final answer is produced
  3. Reward Learning: Models are updated based on prediction-quality feedback (a sketch of this cycle follows)
  4. Continuous Improvement: LoRA adapters fine-tune model behavior over time
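
A minimal sketch of this cycle, assuming the client API from Quick Start (the example prompts and the exact-match reward are illustrative, not part of the library):

from maximum_continual import MaximumContinual, MessageT
from maximum_continual.types import PredictionResponseWithRewardT

client = MaximumContinual(auto_deploy=True)

with client.init_model(model_id="my_model") as model:
    # Hypothetical labeled prompts; any (prompt, expected) source works
    examples = [("What is 2 + 2?", "4"), ("What is the capital of France?", "Paris")]

    for prompt, expected in examples:
        response = model.predict(
            messages=[MessageT(role="user", content=prompt)],
        )
        # Score the prediction however suits the task; exact match here
        answer = str(getattr(response.final_response, "answer", ""))
        reward = 1.0 if expected in answer else 0.0
        # Feed the reward back so the LoRA adapter improves over time
        model.update([
            PredictionResponseWithRewardT(prediction=response, reward=reward)
        ])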

API Reference

MaximumContinual

Main client class for interacting with the system.

client = MaximumContinual(
    modal_app_name: str = "maximum-continual",
    auto_deploy: bool = True
)

Parameters:

  • modal_app_name: Name for the Modal application
  • auto_deploy: Whether to automatically deploy the Modal backend

MaximumContinualModel

Model instance for making predictions and updates.

predict()

response = model.predict(
    messages: List[MessageT],
    tools: List[Tool] = [],
    additional_authorized_imports: List[str] = [],
    final_answer_model: Optional[Type[BaseModel]] = None,
    **kwargs
) -> PredictionResponseT

Parameters:

  • messages: Conversation history
  • tools: Available tools for the agent
  • additional_authorized_imports: Python modules the agent can import
  • final_answer_model: Pydantic model for structured responses

update()

model.update(predictions: List[PredictionResponseWithRewardT]) -> None

Updates the model with reward feedback to improve future performance.

Tool Creation

Create custom tools by extending the Tool class:

class CustomTool(Tool):
    name = "custom_tool"
    description = "Description of what the tool does"
    inputs = {
        "param1": {"type": "string", "description": "Parameter description"},
        "param2": {"type": "integer", "description": "Another parameter"}
    }
    output_type = "string"
    
    def forward(self, param1: str, param2: int) -> str:
        # Tool implementation
        return "Result"

Types

Core Types

class MessageT(BaseModel):
    """Chat message format"""
    role: str
    content: str
    tool_calls: Optional[List[ToolCallT]] = None
    tool_call_id: Optional[str] = None

class PredictionResponseT(BaseModel):
    """Response from prediction"""
    final_response: BaseModel
    messages: List[MessageT]
    metadata: Optional[Dict[str, Any]] = None

class PredictionResponseWithRewardT(BaseModel):
    """Prediction with reward feedback"""
    prediction: PredictionResponseT
    reward: float
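
For example, a conversation is a list of MessageT objects, and the structured answer comes back on final_response (FinalAnswer here is the Pydantic model from Quick Start; its fields are whatever you define):

messages = [
    MessageT(role="system", content="You are a helpful assistant."),
    MessageT(role="user", content="Search for information about Python"),
]

response = model.predict(messages=messages, final_answer_model=FinalAnswer)
print(response.final_response.answer)  # field defined on FinalAnswer
print(len(response.messages))          # messages exchanged during the prediction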

Advanced Usage

Custom System Prompts

from maximum_continual.system_prompt import fetch_default_system_prompt

system_prompt = fetch_default_system_prompt(
    tools=my_tools,
    additional_authorized_imports=["requests", "json", "pandas"],
    final_answer_model=MyResponseModel
)

State Persistence

The code execution environment maintains state across calls:

# First execution: define variables
code1 = "data = {'count': 0}"

# Second execution: use previous variables  
code2 = "data['count'] += 1; print(data)"

Error Handling

Tools should implement proper error handling:

class SafeTool(Tool):
    name = "safe_tool"
    description = "Processes input and reports failures instead of raising"
    inputs = {"input_data": {"type": "string", "description": "Input to process"}}
    output_type = "string"

    def forward(self, input_data: str) -> str:
        try:
            result = input_data.upper()  # replace with real tool logic
            return result
        except Exception as e:
            # Surface the error as a string so the agent can observe it
            return f"Error: {str(e)}"

Returning errors as strings keeps the agent loop running, so the model can see what went wrong and adjust its approach.

Examples

Web Search Agent

See basic_example.py for a complete example of building a web search agent with reward-based learning.

Multi-Tool Agent
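
Any number of tools can be passed in a single call; DataAnalysisTool and FileProcessorTool stand in for your own Tool subclasses.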

tools = [
    WebSearchTool(),
    DataAnalysisTool(), 
    FileProcessorTool()
]

response = model.predict(
    messages=[system_msg, user_msg],
    tools=tools,
    final_answer_model=MyAnswer
)

Custom Reward Functions

def calculate_reward(response: PredictionResponseT, expected: str) -> float:
    # Example scoring: exact match on the answer field; substitute any
    # task-specific metric here
    answer = str(getattr(response.final_response, "answer", ""))
    return 1.0 if answer.strip() == expected.strip() else 0.0

# Apply rewards
model.update([
    PredictionResponseWithRewardT(
        prediction=response,
        reward=calculate_reward(response, ground_truth)
    )
])

Modal Backend

The system automatically deploys and manages a Modal backend for:

  • Model hosting with vLLM
  • LoRA adapter training and storage
  • Scalable inference serving

Modal authentication is required before first use:

modal setup

Development

Testing

pytest tests/

Code Quality

black maximum_continual/
ruff check maximum_continual/
mypy maximum_continual/

Requirements

  • Python ≥3.12, <3.13
  • Modal account and authentication
  • CUDA-compatible GPU (for model training)

Dependencies

Key dependencies include:

  • modal: Cloud compute platform
  • transformers: Hugging Face model library
  • smolagents: Agent framework and code executor
  • litellm: Model inference abstraction
  • vllm: High-performance model serving
  • pydantic: Data validation and serialization
