
agents

⚠️ EXPERIMENTAL MINIMAL AGENT FRAMEWORK ⚠️

A minimal, experimental framework for building agents on top of local LLM deployments, designed for low overhead and maximum simplicity. Ollama support is fully implemented; the other local LLM providers ship as stubs awaiting contributions.

🎯 Project Status

Current State: Experimental / Early Development

  • ✅ Ollama: Fully implemented and tested
  • 📝 llama.cpp: Stub implementation (contributions welcome)
  • 📝 vLLM: Stub implementation (contributions welcome)
  • 📝 Text Generation WebUI: Stub implementation (contributions welcome)
  • 📝 LocalAI: Stub implementation (contributions welcome)
  • 📝 LM Studio: Stub implementation (contributions welcome)

🚀 Quick Start (Ollama)

from agents import LocalOllamaClient, ChatAgent

# Create client
client = LocalOllamaClient(
    model_name="llama3:latest",
    api_base="http://localhost:11434"
)

# Create agent
agent = ChatAgent(client)

# Get response
response = agent.get_full_response([
    {"role": "user", "content": "What is Python?"}
])
print(response)

📦 Installation

cd agents
pip install -e .

# With distributed support (SOLLOL)
pip install -e ".[distributed]"

✨ Features

Minimal by Design:

  • Zero Bloat: Direct API calls with clean abstractions
  • Minimal Dependencies: Only aiohttp and numpy (core)
  • Simple Architecture: Easy to understand and extend
  • No Magic: Explicit, straightforward code

Core Capabilities:

  • Provider Abstraction: Common interface for all local LLM deployments
  • Built-in Agents: Pre-configured agents for common tasks (optional)
  • Streaming Support: Native async streaming
  • Distributed Mode: Optional SOLLOL integration for Ollama clusters
  • Type-Safe: Full type hints throughout

🏗️ Architecture

Base Client Interface

All providers implement BaseOllamaClient:

import abc
from typing import AsyncGenerator, List

class BaseOllamaClient(abc.ABC):
    @abc.abstractmethod
    async def generate_embedding(self, text: str) -> List[float]:
        ...

    @abc.abstractmethod
    async def chat(self, messages, **options) -> AsyncGenerator[ChatResponse, None]:
        ...
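
Since chat() is an async generator, streaming consumption looks roughly like this (a sketch; it assumes the yielded ChatResponse chunks stringify to their text content):

import asyncio

from agents import LocalOllamaClient

async def main():
    client = LocalOllamaClient(
        model_name="llama3:latest",
        api_base="http://localhost:11434",
    )
    # Chunks arrive as the model generates them; no buffering.
    async for chunk in client.chat([{"role": "user", "content": "Hello"}]):
        print(chunk, end="", flush=True)

asyncio.run(main())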

Implemented Providers

Ollama (Fully Implemented)

from agents import LocalOllamaClient, DistributedOllamaClient

# Single node
client = LocalOllamaClient(
    model_name="llama3:latest",
    api_base="http://localhost:11434"
)

# Distributed (with SOLLOL)
client = DistributedOllamaClient(
    model_name="llama3:latest",
    nodes=["http://node1:11434", "http://node2:11434"]
)

Stub Providers (Ready for Implementation)

See agents/providers.py for stub implementations:

from agents import (
    LlamaCppClient,      # llama.cpp server
    VLLMClient,          # vLLM deployment
    TextGenWebUIClient,  # Oobabooga
    LocalAIClient,       # LocalAI
    LMStudioClient,      # LM Studio
)

Note: These will raise NotImplementedError until implemented. Contributions welcome!

🤖 Built-in Agents

  • ChatAgent: Basic conversational agent
  • CodingAgent: Code generation specialist
  • ReasoningAgent: Analytical tasks
  • ResearchAgent: Information synthesis
  • SummarizationAgent: Text summarization
  • EmbeddingAgent: Generate embeddings
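
EmbeddingAgent's exact methods aren't shown above, but embeddings are part of the client interface itself. A minimal sketch that calls generate_embedding directly and compares two vectors (the cosine step is illustrative, and is the kind of vector work numpy is included for):

import asyncio

import numpy as np

from agents import LocalOllamaClient

async def main():
    client = LocalOllamaClient("llama3:latest", "http://localhost:11434")
    a = np.array(await client.generate_embedding("local LLM deployments"))
    b = np.array(await client.generate_embedding("running models on your own hardware"))
    # Cosine similarity between the two embeddings
    print(float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b))))

asyncio.run(main())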

Custom Agents

from agents import BaseAgent

class SQLAgent(BaseAgent):
    system_prompt = "You are an expert SQL developer..."

agent = SQLAgent(client)
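
Using a custom agent works like any built-in one (the prompt here is just an example):

result = agent.get_full_response([
    {"role": "user", "content": "Write a query that counts users by country."}
])
print(result)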

📖 Documentation

  • QUICKSTART.md - Quick start guide
  • INTEGRATION_GUIDE.md - Integration patterns
  • ARCHITECTURE.md - Design documentation

🧪 Examples

# Test framework
python test_as_library.py

# Example application
python example_app.py

# Individual examples
python agents/examples/basic_chat.py
python agents/examples/coding_agent.py
python agents/examples/embeddings.py
python agents/examples/multi_agent_workflow.py

🎯 Use Cases

Web Applications

from flask import Flask, jsonify, request
from agents import LocalOllamaClient, ChatAgent

app = Flask(__name__)
client = LocalOllamaClient("llama3:latest", "http://localhost:11434")
agent = ChatAgent(client)

@app.route('/chat', methods=['POST'])
def chat():
    message = request.json['message']
    response = agent.get_full_response([
        {"role": "user", "content": message}
    ])
    return jsonify({"response": response})

Command-Line Tools

from agents import LocalOllamaClient, ChatAgent
import sys

client = LocalOllamaClient("llama3:latest", "http://localhost:11434")
agent = ChatAgent(client)

query = " ".join(sys.argv[1:])
print(agent.get_full_response([{"role": "user", "content": query}]))

Background Workers

from celery import Celery
from agents import LocalOllamaClient, ResearchAgent

app = Celery('tasks', broker='redis://localhost:6379')
client = LocalOllamaClient("llama3:latest", "http://localhost:11434")
agent = ResearchAgent(client)

@app.task
def research_task(topic):
    return agent.get_full_response([
        {"role": "user", "content": f"Research: {topic}"}
    ])

🔧 Contributing

We welcome contributions, especially for implementing new LLM providers!

Implementing a New Provider

  1. See agents/providers.py for stub implementations
  2. Implement BaseOllamaClient interface:
    • async def generate_embedding(self, text: str) -> List[float]
    • async def chat(self, messages, **options) -> AsyncGenerator[ChatResponse, None]
  3. Follow the pattern in LocalOllamaClient (see agents/ollama_framework.py)
  4. Add tests and examples
  5. Submit a pull request
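
As a rough starting point, a skeleton might look like the following (a hypothetical sketch: the class name, endpoint path, and response shape are placeholders, not the API of any real server):

import aiohttp

from agents import BaseOllamaClient  # assumes the base class is exported

class MyProviderClient(BaseOllamaClient):
    """Hypothetical provider skeleton following the BaseOllamaClient interface."""

    def __init__(self, model_name: str, api_base: str):
        self.model_name = model_name
        self.api_base = api_base

    async def generate_embedding(self, text: str):
        async with aiohttp.ClientSession() as session:
            async with session.post(
                f"{self.api_base}/embeddings",  # placeholder endpoint
                json={"model": self.model_name, "input": text},
            ) as resp:
                data = await resp.json()
                return data["embedding"]  # placeholder response shape

    async def chat(self, messages, **options):
        # Stream ChatResponse chunks from the server; see LocalOllamaClient
        # in agents/ollama_framework.py for the real pattern.
        raise NotImplementedError
        yield  # unreachable; marks this function as an async generator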

Priority Providers

  • llama.cpp - Lightweight C++ implementation
  • vLLM - High-throughput serving
  • LocalAI - OpenAI drop-in replacement
  • Text Generation WebUI - Popular Gradio-based interface
  • LM Studio - User-friendly desktop app

⚠️ Important Notes

Experimental Status

This is an experimental framework in early development:

  • APIs may change without notice
  • Not recommended for production use yet
  • Limited testing and documentation
  • Breaking changes possible

Current Limitations

  • Only Ollama is fully implemented
  • Other providers are stubs requiring implementation
  • Limited error handling in some edge cases
  • Documentation may be incomplete

Roadmap

  • Implement llama.cpp client
  • Implement vLLM client
  • Implement LocalAI client
  • Add comprehensive testing suite
  • Add more utility functions
  • Improve error handling
  • Add caching layer
  • Support multi-modal inputs (images, audio)
  • Add function calling support

📁 Project Structure

OllamaAgent/
├── agents/                      # Main package
│   ├── __init__.py              # Public API
│   ├── ollama_framework.py      # Ollama clients (implemented)
│   ├── providers.py             # Other provider stubs
│   ├── agents.py                # Agent classes
│   ├── utils.py                 # Utility functions
│   └── examples/                # Usage examples
├── setup.py                     # Package installation
├── requirements.txt             # Dependencies
├── example_app.py               # Full application example
├── test_as_library.py           # Library usage tests
├── README.md                    # This file
├── QUICKSTART.md                # Quick start guide
├── INTEGRATION_GUIDE.md         # Integration patterns
└── ARCHITECTURE.md              # Design documentation

🧩 Dependencies

Core (minimal):

  • aiohttp>=3.8.0 - Async HTTP client
  • numpy>=1.20.0 - Vector operations

Optional:

  • sollol>=0.1.0 - Distributed Ollama load balancing

📄 License

MIT License - See LICENSE file for details

🤝 Support & Community

  • Issues: Report bugs or request features on GitHub
  • Examples: Check agents/examples/ for working code
  • Documentation: Read guides in QUICKSTART.md and INTEGRATION_GUIDE.md
  • Contributing: See contributing guidelines above


⚡ Performance

Ollama Implementation:

  • Direct HTTP calls (~1-5ms overhead)
  • Connection pooling via aiohttp
  • Streaming support (no buffering)
  • Optional distributed mode with SOLLOL

🔒 Security

Local Deployment Focus:

  • No API keys required
  • All models run locally
  • Full control over data
  • No external API calls (except optional SOLLOL for Ollama clusters)

Status: Experimental Minimal Framework | Philosophy: Zero bloat, maximum simplicity | Primary Use: Ollama deployments | Looking for: Contributors to implement other providers!

Made for the local LLM community ❤️ | Keep it minimal, keep it simple.



Download files

Download the file for your platform.

Source Distribution

local_agents-0.1.0.tar.gz (19.2 kB)


Built Distribution


local_agents-0.1.0-py3-none-any.whl (18.4 kB)


File details

Details for the file local_agents-0.1.0.tar.gz.

File metadata

  • Download URL: local_agents-0.1.0.tar.gz
  • Upload date:
  • Size: 19.2 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.10.12

File hashes

Hashes for local_agents-0.1.0.tar.gz

  • SHA256: 77a643318fc7c11fa461b66922610567e0b095dc117f95c11148ca3dcc8dbf1e
  • MD5: 8682f5262a0eb8024a8d176611143757
  • BLAKE2b-256: 0c51b88900152ae6af595adaa81f703384d36ac9adf08a078742d7f7c20c0ef7


File details

Details for the file local_agents-0.1.0-py3-none-any.whl.

File metadata

  • Download URL: local_agents-0.1.0-py3-none-any.whl
  • Upload date:
  • Size: 18.4 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.10.12

File hashes

Hashes for local_agents-0.1.0-py3-none-any.whl

  • SHA256: 441b3028081f7f98f7667753c9b3c5e10fdc1a19e8c1a41fb822303928e28643
  • MD5: 897be45b0b4830676a562f89f71eae7f
  • BLAKE2b-256: 1e90b623ea45020bcd278308284b5935eb5965a48f7e0770afb38585786b02d7

