agents
⚠️ EXPERIMENTAL MINIMAL AGENT FRAMEWORK ⚠️
A minimal, experimental framework for building agents with local LLM deployments, designed for minimal overhead and maximum simplicity. Ollama is currently the only fully implemented backend; stubs are included for other local LLM providers.
🎯 Project Status
Current State: Experimental / Early Development
- ✅ Ollama: Fully implemented and tested
- 📝 llama.cpp: Stub implementation (contributions welcome)
- 📝 vLLM: Stub implementation (contributions welcome)
- 📝 Text Generation WebUI: Stub implementation (contributions welcome)
- 📝 LocalAI: Stub implementation (contributions welcome)
- 📝 LM Studio: Stub implementation (contributions welcome)
🚀 Quick Start (Ollama)
from agents import LocalOllamaClient, ChatAgent
# Create client
client = LocalOllamaClient(
    model_name="llama3:latest",
    api_base="http://localhost:11434"
)
# Create agent
agent = ChatAgent(client)
# Get response
response = agent.get_full_response([
    {"role": "user", "content": "What is Python?"}
])
print(response)
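get_full_response returns the whole reply at once. For token-by-token output you can consume the client's chat generator directly. A minimal sketch, assuming chat() is an async generator as the base interface below suggests and that each ChatResponse chunk carries its text in a content attribute (check the package for the actual field name):
import asyncio

from agents import LocalOllamaClient

async def stream_demo():
    client = LocalOllamaClient(
        model_name="llama3:latest",
        api_base="http://localhost:11434"
    )
    # chat() yields ChatResponse chunks per the BaseOllamaClient interface
    async for chunk in client.chat([{"role": "user", "content": "What is Python?"}]):
        # "content" is an assumed attribute name; fall back to printing the chunk itself
        print(getattr(chunk, "content", chunk), end="", flush=True)

asyncio.run(stream_demo())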
📦 Installation
cd agents
pip install -e .
# With distributed support (SOLLOL)
pip install -e ".[distributed]"
✨ Features
Minimal by Design:
- Zero Bloat: Direct API calls with clean abstractions
- Minimal Dependencies: Only aiohttp and numpy (core)
- Simple Architecture: Easy to understand and extend
- No Magic: Explicit, straightforward code
Core Capabilities:
- Provider Abstraction: Common interface for all local LLM deployments
- Built-in Agents: Pre-configured agents for common tasks (optional)
- Streaming Support: Native async streaming
- Distributed Mode: Optional SOLLOL integration for Ollama clusters
- Type-Safe: Full type hints throughout
🏗️ Architecture
Base Client Interface
All providers implement BaseOllamaClient:
class BaseOllamaClient(abc.ABC):
    @abc.abstractmethod
    async def generate_embedding(self, text: str) -> List[float]:
        ...

    @abc.abstractmethod
    async def chat(self, messages, **options) -> AsyncGenerator[ChatResponse, None]:
        ...
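The embedding half of the interface can be exercised directly against the implemented Ollama client. A minimal sketch, assuming a local Ollama server and a model that can produce embeddings:
import asyncio

from agents import LocalOllamaClient

async def embed_demo():
    client = LocalOllamaClient(
        model_name="llama3:latest",
        api_base="http://localhost:11434"
    )
    # generate_embedding() is part of the BaseOllamaClient contract
    vector = await client.generate_embedding("local LLM agents")
    print(len(vector))  # dimensionality depends on the model

asyncio.run(embed_demo())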
Implemented Providers
Ollama (Fully Implemented)
from agents import LocalOllamaClient, DistributedOllamaClient
# Single node
client = LocalOllamaClient(
    model_name="llama3:latest",
    api_base="http://localhost:11434"
)
# Distributed (with SOLLOL)
client = DistributedOllamaClient(
    model_name="llama3:latest",
    nodes=["http://node1:11434", "http://node2:11434"]
)
Stub Providers (Ready for Implementation)
See agents/providers.py for stub implementations:
from agents import (
    LlamaCppClient,      # llama.cpp server
    VLLMClient,          # vLLM deployment
    TextGenWebUIClient,  # Oobabooga
    LocalAIClient,       # LocalAI
    LMStudioClient,      # LM Studio
)
Note: These will raise NotImplementedError until implemented. Contributions welcome!
🤖 Built-in Agents
- ChatAgent: Basic conversational agent
- CodingAgent: Code generation specialist
- ReasoningAgent: Analytical tasks
- ResearchAgent: Information synthesis
- SummarizationAgent: Text summarization
- EmbeddingAgent: Generate embeddings
Custom Agents
from agents import BaseAgent

class SQLAgent(BaseAgent):
    system_prompt = "You are an expert SQL developer..."

agent = SQLAgent(client)
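Usage then mirrors the built-in agents:
print(agent.get_full_response([
    {"role": "user", "content": "Write a query that returns the ten most recent orders."}
]))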
📖 Documentation
- Quick Start (QUICKSTART.md) - 5-minute tutorial
- Integration Guide (INTEGRATION_GUIDE.md) - Use in applications
- Architecture (ARCHITECTURE.md) - Design details
- Examples (agents/examples/) - Working code samples
🧪 Examples
# Test framework
python test_as_library.py
# Example application
python example_app.py
# Individual examples
python agents/examples/basic_chat.py
python agents/examples/coding_agent.py
python agents/examples/embeddings.py
python agents/examples/multi_agent_workflow.py
🎯 Use Cases
Web Applications
from flask import Flask, jsonify, request
from agents import LocalOllamaClient, ChatAgent
app = Flask(__name__)
client = LocalOllamaClient("llama3:latest", "http://localhost:11434")
agent = ChatAgent(client)
@app.route('/chat', methods=['POST'])
def chat():
    message = request.json['message']
    response = agent.get_full_response([
        {"role": "user", "content": message}
    ])
    return jsonify({"response": response})
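Once the app is running (for example via flask run, which listens on port 5000 by default), the endpoint can be exercised with any HTTP client:
curl -X POST http://localhost:5000/chat \
  -H "Content-Type: application/json" \
  -d '{"message": "What is Python?"}'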
Command-Line Tools
from agents import LocalOllamaClient, ChatAgent
import sys
client = LocalOllamaClient("llama3:latest", "http://localhost:11434")
agent = ChatAgent(client)
query = " ".join(sys.argv[1:])
print(agent.get_full_response([{"role": "user", "content": query}]))
Background Workers
from celery import Celery
from agents import LocalOllamaClient, ResearchAgent
app = Celery('tasks', broker='redis://localhost:6379')
client = LocalOllamaClient("llama3:latest", "http://localhost:11434")
agent = ResearchAgent(client)
@app.task
def research_task(topic):
    return agent.get_full_response([
        {"role": "user", "content": f"Research: {topic}"}
    ])
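Assuming the snippet above is saved as tasks.py, start a worker with celery -A tasks worker --loglevel=info and enqueue work from application code with research_task.delay("local LLM frameworks").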
🔧 Contributing
We welcome contributions, especially for implementing new LLM providers!
Implementing a New Provider
- See agents/providers.py for stub implementations
- Implement the BaseOllamaClient interface:
  - async def generate_embedding(self, text: str) -> List[float]
  - async def chat(self, messages, **options) -> AsyncGenerator[ChatResponse, None]
- Follow the pattern in LocalOllamaClient (see agents/ollama_framework.py); a rough sketch follows this list
- Add tests and examples
- Submit a pull request
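As a rough starting point, the sketch below fills in a llama.cpp-style client against the two abstract methods. It is illustrative only: the import location of BaseOllamaClient, the llama.cpp server endpoints (/embedding, /v1/chat/completions), and the response fields are assumptions to verify against your setup, and a real implementation should yield the framework's ChatResponse objects rather than raw strings.
from typing import AsyncGenerator, List

import aiohttp

from agents.ollama_framework import BaseOllamaClient  # assumed location of the base class


class MyLlamaCppClient(BaseOllamaClient):
    def __init__(self, model_name: str, api_base: str = "http://localhost:8080"):
        self.model_name = model_name
        self.api_base = api_base

    async def generate_embedding(self, text: str) -> List[float]:
        # Endpoint path and "embedding" response key are assumptions about the
        # llama.cpp server API; adjust to match your server version.
        async with aiohttp.ClientSession() as session:
            async with session.post(f"{self.api_base}/embedding", json={"content": text}) as resp:
                data = await resp.json()
                return data["embedding"]

    async def chat(self, messages, **options) -> AsyncGenerator:
        # Non-streaming sketch against the OpenAI-compatible endpoint that newer
        # llama.cpp server builds expose; wrap the result in the framework's
        # ChatResponse type in a real implementation.
        payload = {"model": self.model_name, "messages": messages, **options}
        async with aiohttp.ClientSession() as session:
            async with session.post(f"{self.api_base}/v1/chat/completions", json=payload) as resp:
                data = await resp.json()
                yield data["choices"][0]["message"]["content"]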
Priority Providers
- llama.cpp - Lightweight C++ implementation
- vLLM - High-throughput serving
- LocalAI - OpenAI drop-in replacement
- Text Generation WebUI - Popular Gradio-based interface
- LM Studio - User-friendly desktop app
⚠️ Important Notes
Experimental Status
This is an experimental framework in early development:
- APIs may change without notice
- Not recommended for production use yet
- Limited testing and documentation
- Breaking changes possible
Current Limitations
- Only Ollama is fully implemented
- Other providers are stubs requiring implementation
- Limited error handling in some edge cases
- Documentation may be incomplete
Roadmap
- Implement llama.cpp client
- Implement vLLM client
- Implement LocalAI client
- Add comprehensive testing suite
- Add more utility functions
- Improve error handling
- Add caching layer
- Support multi-modal (images, audio)
- Function calling support
📁 Project Structure
OllamaAgent/
├── agents/ # Main package
│ ├── __init__.py # Public API
│ ├── ollama_framework.py # Ollama clients (implemented)
│ ├── providers.py # Other provider stubs
│ ├── agents.py # Agent classes
│ ├── utils.py # Utility functions
│ └── examples/ # Usage examples
├── setup.py # Package installation
├── requirements.txt # Dependencies
├── example_app.py # Full application example
├── test_as_library.py # Library usage tests
├── README.md # This file
├── QUICKSTART.md # Quick start guide
├── INTEGRATION_GUIDE.md # Integration patterns
└── ARCHITECTURE.md # Design documentation
🧩 Dependencies
Core (minimal):
- aiohttp>=3.8.0 - Async HTTP client
- numpy>=1.20.0 - Vector operations
Optional:
- sollol>=0.1.0 - Distributed Ollama load balancing
📄 License
MIT License - See LICENSE file for details
🤝 Support & Community
- Issues: Report bugs or request features on GitHub
- Examples: Check agents/examples/ for working code
- Documentation: Read the guides in QUICKSTART.md and INTEGRATION_GUIDE.md
- Contributing: See the contributing guidelines above
🎓 Related Projects
Built with patterns from:
- Hydra - Advanced reasoning engine
- SynapticLlamas - Multi-agent orchestration
- FlockParser - Document processing with RAG
- SOLLOL - Intelligent Ollama load balancer
⚡ Performance
Ollama Implementation:
- Direct HTTP calls (~1-5ms overhead)
- Connection pooling via aiohttp
- Streaming support (no buffering)
- Optional distributed mode with SOLLOL
🔒 Security
Local Deployment Focus:
- No API keys required
- All models run locally
- Full control over data
- No external API calls (except optional SOLLOL for Ollama clusters)
Status: Experimental Minimal Framework | Philosophy: Zero bloat, maximum simplicity | Primary Use: Ollama deployments | Looking for: Contributors to implement other providers!
Made for the local LLM community ❤️ | Keep it minimal, keep it simple.
Download files
File details
Details for the file local_agents-0.1.0.tar.gz.
File metadata
- Download URL: local_agents-0.1.0.tar.gz
- Upload date:
- Size: 19.2 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.10.12
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 77a643318fc7c11fa461b66922610567e0b095dc117f95c11148ca3dcc8dbf1e |
| MD5 | 8682f5262a0eb8024a8d176611143757 |
| BLAKE2b-256 | 0c51b88900152ae6af595adaa81f703384d36ac9adf08a078742d7f7c20c0ef7 |
File details
Details for the file local_agents-0.1.0-py3-none-any.whl.
File metadata
- Download URL: local_agents-0.1.0-py3-none-any.whl
- Upload date:
- Size: 18.4 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.10.12
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 441b3028081f7f98f7667753c9b3c5e10fdc1a19e8c1a41fb822303928e28643 |
| MD5 | 897be45b0b4830676a562f89f71eae7f |
| BLAKE2b-256 | 1e90b623ea45020bcd278308284b5935eb5965a48f7e0770afb38585786b02d7 |