Maximum Continual
A clean API for continual learning with LoRA models using reward-based feedback.
Overview
Maximum Continual is a Python library that enables continuous learning for AI agents through:
- Agent-based Architecture: Uses a code execution agent with tool access
- LoRA Integration: Leverages Low-Rank Adaptation for efficient model fine-tuning
- Modal Backend: Scalable cloud-based model hosting and training
- Reward-based Learning: Updates models based on performance feedback
- Tool System: Extensible framework for custom tools and capabilities
Quick Start
Installation
The project uses Poetry for dependency management. For a local editable install:

```shell
pip install -e .
```
Basic Usage
```python
from maximum_continual import MaximumContinual, Tool, MessageT
from maximum_continual.system_prompt import fetch_default_system_prompt
from maximum_continual.types import PredictionResponseWithRewardT
from pydantic import BaseModel

# Define a custom tool
class WebSearchTool(Tool):
    name = "web_search"
    description = "Performs a web search and returns results"
    inputs = {"query": {"type": "string", "description": "Search query"}}
    output_type = "string"

    def forward(self, query: str) -> str:
        # Implementation here
        return "Search results..."

# Define response structure
class FinalAnswer(BaseModel):
    answer: str
    reasoning: str

# Initialize client
client = MaximumContinual(auto_deploy=True)

# Create and use a model
with client.init_model(model_id="my_model") as model:
    tools = [WebSearchTool()]

    # Make a prediction
    response = model.predict(
        messages=[
            MessageT(role="system", content=fetch_default_system_prompt(
                tools,
                additional_authorized_imports=["os"],
                final_answer_model=FinalAnswer,
            )),
            MessageT(role="user", content="Search for information about Python"),
        ],
        final_answer_model=FinalAnswer,
        tools=tools,
        additional_authorized_imports=["os"],
    )

    # Update the model with reward feedback
    model.update([
        PredictionResponseWithRewardT(
            prediction=response,
            reward=1.0,  # Positive reward for good performance
        )
    ])
```
Architecture
Core Components
1. MaximumContinual Client
The main entry point that handles:
- Modal backend deployment and management
- Model lifecycle (initialization, loading, cleanup)
- Backend health monitoring
2. Agent System
- MaximumContinualAgent: Orchestrates the agent loop
- CodeExecutorTool: Executes Python code with tool access
- LocalPythonExecutor: Sandboxed Python execution environment
3. Backend Infrastructure
- Modal Backend: Cloud-based model hosting and LoRA training
- vLLM Backend: High-performance model inference
- LoRA Management: Dynamic adapter loading and unloading
4. Tool System
- Base Tools: Foundation for creating custom tools
- Tool Validation: Ensures tool safety and compatibility
- State Persistence: Tools maintain state across executions
How It Works
- Model Initialization: Creates or loads an existing model with optional LoRA adapters
- Agent Loop:
  - Receives messages and available tools
  - Uses the code executor to run Python code
  - Tools are accessible as Python functions within the execution environment
  - Iterates until a final answer is provided
- Reward Learning: Models are updated based on prediction-quality feedback
- Continuous Improvement: LoRA adapters fine-tune model behavior over time
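The agent loop can be sketched in plain Python. Everything below is illustrative, not the library's actual internals: tools are exposed as ordinary functions inside an execution namespace, and the loop ends once the generated code assigns a final answer.

```python
# Illustrative sketch of the agent loop (not the library's real executor):
# tools become plain functions inside a shared exec() namespace, and the
# loop terminates when a snippet assigns a final answer.

def web_search(query: str) -> str:
    # Stand-in tool implementation
    return f"results for {query!r}"

namespace = {"web_search": web_search, "final_answer": None}

# Code snippets an LLM would normally generate, one per loop step:
steps = [
    "results = web_search('python')",
    "final_answer = {'answer': results, 'reasoning': 'single search'}",
]

for code in steps:
    exec(code, namespace)              # tools callable as plain functions
    if namespace["final_answer"] is not None:
        break                          # loop ends on a final answer
```

In the real system the sandboxed executor also validates imports and persists state between steps; this sketch only shows the control flow.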
API Reference
MaximumContinual
Main client class for interacting with the system.
```python
client = MaximumContinual(
    modal_app_name: str = "maximum-continual",
    auto_deploy: bool = True
)
```
Parameters:
- `modal_app_name`: Name for the Modal application
- `auto_deploy`: Whether to automatically deploy the Modal backend
MaximumContinualModel
Model instance for making predictions and updates.
predict()
```python
response = model.predict(
    messages: List[MessageT],
    tools: List[Tool] = [],
    additional_authorized_imports: List[str] = [],
    final_answer_model: Optional[BaseModel] = None,
    **kwargs
) -> PredictionResponseT
```
Parameters:
- `messages`: Conversation history
- `tools`: Available tools for the agent
- `additional_authorized_imports`: Python modules the agent can import
- `final_answer_model`: Pydantic model for structured responses
update()
```python
model.update(predictions: List[PredictionResponseWithRewardT]) -> None
```
Updates the model with reward feedback to improve future performance.
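Several scored predictions can be batched into one `update()` call. The sketch below uses a dataclass stand-in for `PredictionResponseWithRewardT`, and the reward scale (roughly -1.0 to 1.0, with negative values for poor answers) is an assumption rather than documented behavior.

```python
# Sketch: batching graded rewards before a single update() call.
# PredictionWithReward is a stand-in for the library's
# PredictionResponseWithRewardT; the reward scale is an assumption.
from dataclasses import dataclass

@dataclass
class PredictionWithReward:
    prediction: dict
    reward: float

batch = [
    PredictionWithReward({"answer": "42"}, reward=1.0),   # correct
    PredictionWithReward({"answer": "41"}, reward=-0.5),  # wrong but close
]
# model.update(batch)  # one call applies all feedback at once

avg_reward = sum(p.reward for p in batch) / len(batch)
```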
Tool Creation
Create custom tools by extending the Tool class:
```python
class CustomTool(Tool):
    name = "custom_tool"
    description = "Description of what the tool does"
    inputs = {
        "param1": {"type": "string", "description": "Parameter description"},
        "param2": {"type": "integer", "description": "Another parameter"},
    }
    output_type = "string"

    def forward(self, param1: str, param2: int) -> str:
        # Tool implementation
        return "Result"
```
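To show how the metadata and `forward()` fit together, here is a self-contained sketch with a minimal stand-in for the library's `Tool` base class (the real base class lives in `maximum_continual` and does more, e.g. input validation):

```python
# Minimal stand-in for the library's Tool base class, only to illustrate
# how a custom tool's metadata and forward() interact.
class Tool:
    name: str = ""
    description: str = ""
    inputs: dict = {}
    output_type: str = "string"

    def __call__(self, **kwargs):
        return self.forward(**kwargs)

class RepeatTool(Tool):
    name = "repeat"
    description = "Repeats a string a given number of times"
    inputs = {
        "text": {"type": "string", "description": "Text to repeat"},
        "count": {"type": "integer", "description": "Repeat count"},
    }
    output_type = "string"

    def forward(self, text: str, count: int) -> str:
        return text * count

tool = RepeatTool()
result = tool(text="ab", count=3)
```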
Types
Core Types
```python
class MessageT(BaseModel):
    """Chat message format"""
    role: str
    content: str
    tool_calls: Optional[List[ToolCallT]] = None
    tool_call_id: Optional[str] = None

class PredictionResponseT(BaseModel):
    """Response from prediction"""
    final_response: BaseModel
    messages: List[MessageT]
    metadata: Optional[Dict[str, Any]] = None

class PredictionResponseWithRewardT(BaseModel):
    """Prediction with reward feedback"""
    prediction: PredictionResponseT
    reward: float
```
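The types nest: a reward wraps a prediction, which carries the final response plus the message history that produced it. A dependency-free sketch using dataclass stand-ins (field names match the library's pydantic models above):

```python
# Dataclass stand-ins for the pydantic types above, showing how they nest.
from dataclasses import dataclass
from typing import Any, Dict, List, Optional

@dataclass
class MessageT:
    role: str
    content: str

@dataclass
class PredictionResponseT:
    final_response: Any
    messages: List[MessageT]
    metadata: Optional[Dict[str, Any]] = None

@dataclass
class PredictionResponseWithRewardT:
    prediction: PredictionResponseT
    reward: float

history = [MessageT("user", "Search for information about Python")]
pred = PredictionResponseT(final_response={"answer": "..."}, messages=history)
scored = PredictionResponseWithRewardT(prediction=pred, reward=1.0)
```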
Advanced Usage
Custom System Prompts
```python
from maximum_continual.system_prompt import fetch_default_system_prompt

system_prompt = fetch_default_system_prompt(
    tools=my_tools,
    additional_authorized_imports=["requests", "json", "pandas"],
    final_answer_model=MyResponseModel,
)
```
State Persistence
The code execution environment maintains state across calls:
```python
# First execution: define variables
code1 = "data = {'count': 0}"

# Second execution: use previous variables
code2 = "data['count'] += 1; print(data)"
```
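The effect can be reproduced with plain `exec()` over one shared namespace dict, which is a simplification of the sandboxed executor but shows the same persistence:

```python
# Simplified illustration of state persistence: one namespace dict stays
# alive between executions, so later snippets see earlier variables.
# (The real executor is sandboxed; plain exec() is just for illustration.)
namespace = {}

exec("data = {'count': 0}", namespace)   # first execution defines state
exec("data['count'] += 1", namespace)    # later executions mutate it
exec("data['count'] += 1", namespace)
```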
Error Handling
Tools should implement proper error handling:
```python
class SafeTool(Tool):
    def forward(self, input_data: str) -> str:
        try:
            result = do_work(input_data)  # replace with real tool logic
            return result
        except Exception as e:
            return f"Error: {str(e)}"
```
Examples
Web Search Agent
See basic_example.py for a complete example of building a web search agent with reward-based learning.
Multi-Tool Agent
```python
tools = [
    WebSearchTool(),
    DataAnalysisTool(),
    FileProcessorTool(),
]

response = model.predict(
    messages=[system_msg, user_msg],
    tools=tools,
    final_answer_model=MyAnswer,
)
```
Custom Reward Functions
```python
def calculate_reward(response: PredictionResponseT, expected: str) -> float:
    # Custom reward logic
    accuracy = calculate_accuracy(response.final_response, expected)
    return float(accuracy)

# Apply rewards
model.update([
    PredictionResponseWithRewardT(
        prediction=response,
        reward=calculate_reward(response, ground_truth),
    )
])
```
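One concrete way to grade answers (an assumption, not a library-provided function) is exact match scoring 1.0, with partial credit from token overlap against the expected string:

```python
# Hypothetical reward function: exact match earns 1.0, otherwise the
# fraction of expected tokens present in the answer. Not part of the
# library; shown only as one possible reward design.
def token_overlap_reward(answer: str, expected: str) -> float:
    if answer.strip() == expected.strip():
        return 1.0
    expected_tokens = set(expected.lower().split())
    if not expected_tokens:
        return 0.0
    answer_tokens = set(answer.lower().split())
    return len(expected_tokens & answer_tokens) / len(expected_tokens)
```

Smooth partial credit like this gives the LoRA updates a gradient of feedback rather than an all-or-nothing signal.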
Modal Backend
The system automatically deploys and manages a Modal backend for:
- Model hosting with vLLM
- LoRA adapter training and storage
- Scalable inference serving
Authentication is required:

```shell
modal setup
```
Development
Testing
```shell
pytest tests/
```
Code Quality
```shell
black maximum_continual/
ruff check maximum_continual/
mypy maximum_continual/
```
Requirements
- Python ≥3.12, <3.13
- Modal account and authentication
- CUDA-compatible GPU (for model training)
Dependencies
Key dependencies include:
- `modal`: Cloud compute platform
- `transformers`: Hugging Face model library
- `smolagents`: Agent framework and code executor
- `litellm`: Model inference abstraction
- `vllm`: High-performance model serving
- `pydantic`: Data validation and serialization
File details
Details for the file maximum_continual-0.1.5.tar.gz.
File metadata
- Download URL: maximum_continual-0.1.5.tar.gz
- Upload date:
- Size: 212.6 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.12.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 8b811acb7728dfc6a57a91003eef9c5ee17a31b0c69f7c15e5ce3a7f4f52e2e6 |
| MD5 | fd2f575451be3d9f4a9c74da394c1451 |
| BLAKE2b-256 | 3d14cbe7f3723ed579bd93d507d84fabda82140c44bf0cb070230bd5ac721516 |
File details
Details for the file maximum_continual-0.1.5-py3-none-any.whl.
File metadata
- Download URL: maximum_continual-0.1.5-py3-none-any.whl
- Upload date:
- Size: 77.9 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.12.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 5f70052d555dd718f72571dd876adeaa5501cbd6d36a88fab99f74a6eb49fd1b |
| MD5 | 3dcfcdcc0e5c1baf8e70ead7c579c30e |
| BLAKE2b-256 | 96b2ffde7b98de9c5aaf0599f0141c1b424934abd13975eb4db5a2ac7c2e9e7a |