# SlimAgents

A lightweight and developer-friendly library for building and orchestrating AI agents.
SlimAgents wraps any LLM (via LiteLLM) with a simple Agent class that handles tool calling, streaming, structured outputs, multi-modal inputs, and agent handoffs — all in under 1200 lines of code.
## Install

Requires Python 3.10+.

```bash
pip install slimagents
```

Or install the latest development version:

```bash
pip install git+https://github.com/aremeis/slimagents.git
```
## Quick start

```python
from slimagents import Agent

def calculator(expression: str) -> str:
    """Evaluate a Python expression."""
    return str(eval(expression))

agent = Agent(
    instructions="You are a helpful assistant. Use the calculator tool for math.",
    tools=[calculator],
)

value = agent.apply("What is 1234 * 5678?")
print(value)  # "1234 * 5678 = 7,006,652."
```

`apply()` is a synchronous convenience method that returns just the response value. For async code, call the agent directly:

```python
value = await agent("What is 1234 * 5678?")
```
## Tools

A tool is just a Python function. The function name, docstring, and type annotations are automatically converted to the LLM's tool schema; no decorators or registration are needed.

```python
def get_weather(city: str, unit: str = "celsius") -> str:
    """Get the current weather for a city."""
    return f"22 degrees {unit} in {city}"

agent = Agent(tools=[get_weather])
```
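To see what this conversion involves, here is a rough sketch of how a signature like `get_weather` maps to an OpenAI-style tool schema. This is an illustrative reimplementation, not SlimAgents' actual code, and the exact schema it emits may differ in detail:

```python
import inspect
from typing import get_type_hints

# Hypothetical sketch of signature-to-schema conversion.
PY_TO_JSON = {str: "string", int: "integer", float: "number", bool: "boolean"}

def to_tool_schema(fn) -> dict:
    sig = inspect.signature(fn)
    hints = get_type_hints(fn)
    props, required = {}, []
    for name, param in sig.parameters.items():
        props[name] = {"type": PY_TO_JSON.get(hints.get(name, str), "string")}
        if param.default is inspect.Parameter.empty:
            required.append(name)  # parameters without defaults are required
    return {
        "type": "function",
        "function": {
            "name": fn.__name__,
            "description": inspect.getdoc(fn) or "",
            "parameters": {"type": "object", "properties": props, "required": required},
        },
    }

def get_weather(city: str, unit: str = "celsius") -> str:
    """Get the current weather for a city."""
    return f"22 degrees {unit} in {city}"

schema = to_tool_schema(get_weather)
print(schema["function"]["name"])                    # get_weather
print(schema["function"]["parameters"]["required"])  # ['city']
```

The docstring becomes the tool description the LLM sees, which is why descriptive docstrings matter for tool-call quality.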
## Async tools

Both sync and async tools are supported. When the LLM generates multiple tool calls in one turn, async tools run concurrently:

```python
import httpx

async def fetch_url(url: str) -> str:
    """Fetch the content of a URL."""
    async with httpx.AsyncClient() as client:
        response = await client.get(url)
        return response.text

agent = Agent(tools=[fetch_url])
```
## Tools as methods

Tools can be methods on an `Agent` subclass, which lets you encapsulate state and logic:

```python
import python_weather
from slimagents import Agent

class WeatherAgent(Agent):
    def __init__(self):
        super().__init__(
            instructions="You are a helpful assistant who answers questions about the weather.",
            tools=[self.get_temperature],
        )

    async def get_temperature(self, location: str) -> float:
        """Get the current temperature in a given location, in degrees Celsius."""
        async with python_weather.Client(unit=python_weather.METRIC) as client:
            weather = await client.get(location)
            return weather.temperature

agent = WeatherAgent()
value = agent.apply("What is the temperature difference between London and Paris?")
print(value)
# The temperature difference between London and Paris is 1°C, with London being warmer.
```

Since `get_temperature` is async, both calls run in parallel when the LLM requests them simultaneously.
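The benefit of that concurrency is the standard asyncio one: when several awaitables are scheduled together, their wait times overlap instead of adding up. A minimal illustration with plain asyncio (no SlimAgents involved, latency simulated with `asyncio.sleep`):

```python
import asyncio
import time

async def get_temperature(location: str) -> float:
    # Simulated network latency for a weather lookup.
    await asyncio.sleep(0.2)
    return {"London": 12.0, "Paris": 11.0}[location]

async def main() -> float:
    start = time.perf_counter()
    # Both lookups run concurrently, so total time is ~0.2s, not ~0.4s.
    london, paris = await asyncio.gather(
        get_temperature("London"), get_temperature("Paris")
    )
    elapsed = time.perf_counter() - start
    assert elapsed < 0.35  # overlapped, not sequential
    return london - paris

print(asyncio.run(main()))  # 1.0
```

With sync tools the same two calls would execute one after the other, so prefer async tools whenever they do I/O.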
## LLM support

SlimAgents uses LiteLLM under the hood, so you can use virtually any LLM. The default model is `gpt-4.1`. Specify any model string that LiteLLM supports:

```python
# OpenAI
agent = Agent(model="gpt-4.1-mini")

# Anthropic
agent = Agent(model="anthropic/claude-sonnet-4-20250514")

# Google Gemini
agent = Agent(model="gemini/gemini-2.5-flash")

# Azure, AWS Bedrock, Ollama, etc.: see the LiteLLM docs
```

Any extra keyword arguments are passed through to LiteLLM:

```python
agent = Agent(model="gpt-4.1", api_key="sk-...", base_url="https://my-proxy.com")
```
## LiteLLM parameters

All LiteLLM-specific parameters are supported via keyword arguments:

```python
# Retry transient errors (429, 500+) with exponential backoff
agent = Agent(num_retries=3)

# Fall back to a different model on context window errors
agent = Agent(
    model="gpt-4.1",
    context_window_fallback_dict={"gpt-4.1": "gpt-4.1-mini"},
)

# Model fallbacks on any failure
agent = Agent(model="gpt-4.1", fallbacks=["anthropic/claude-sonnet-4-20250514"])
```
## Instructions

Instructions become the system message. They can be a string, or a callable for dynamic instructions:

```python
from datetime import date

# Static instructions
agent = Agent(instructions="You are a helpful assistant.")

# Dynamic instructions via a callable
agent = Agent(instructions=lambda: f"Today's date is {date.today()}")
```

You can also override the `instructions` property in a subclass for full control:

```python
class StrictAgent(Agent):
    def __init__(self, max_responses: int):
        super().__init__(tools=[self.decrement])
        self._answers_left = max_responses

    @property
    def instructions(self) -> str:
        if self._answers_left > 0:
            return f"You have {self._answers_left} responses left. Call `decrement` before each response."
        return "You always answer 'I can't answer that.'"

    def decrement(self):
        """Call this before every response."""
        self._answers_left -= 1
        return "OK"
```
## Memory

Memory is a list of message dicts in OpenAI chat format. There are two levels:

- Default memory (`agent.memory`): always included in every call; set at construction or via the property.
- Per-call memory: passed to `run()`/`apply()` and tracks the conversation for that call.

```python
agent = Agent(instructions="You are a helpful assistant.")

# Maintain a conversation across multiple calls
memory = []
agent.apply("My name is Alice.", memory=memory)
value = agent.apply("What's my name?", memory=memory)
print(value)  # "Your name is Alice."
```

Use `memory_delta` to capture only the new messages added during a call:

```python
delta = []
agent.apply("Hello!", memory=memory, memory_delta=delta)
print(len(delta))  # Number of new messages (user message + assistant response + any tool calls)
```
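Because memory is a plain list of OpenAI-format dicts, it serializes naturally to JSON, so persisting a conversation between sessions is straightforward. A sketch under that assumption (the helper names and file path are illustrative, not part of SlimAgents):

```python
import json
from pathlib import Path

def save_memory(memory: list[dict], path: Path) -> None:
    # Message dicts are JSON-serializable, so a plain dump suffices.
    path.write_text(json.dumps(memory, indent=2))

def load_memory(path: Path) -> list[dict]:
    # Start with an empty conversation if nothing has been saved yet.
    return json.loads(path.read_text()) if path.exists() else []

# Round-trip a small conversation.
memory = [
    {"role": "user", "content": "My name is Alice."},
    {"role": "assistant", "content": "Nice to meet you, Alice!"},
]
store = Path("conversation.json")
save_memory(memory, store)
assert load_memory(store) == memory
store.unlink()  # clean up
```

The loaded list can then be passed back as `memory=` on the next call to resume the conversation.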
## Handoffs

A tool can return an `Agent` instance to transfer control to a different agent. The new agent inherits the conversation memory:

```python
sales_agent = Agent(
    name="Sales Agent",
    instructions="You are a sales agent. Help the customer with purchases.",
)

support_agent = Agent(
    name="Support Agent",
    instructions="You are a support agent. Help with technical issues.",
)

def transfer_to_sales():
    """Transfer the customer to the sales team."""
    return sales_agent

def transfer_to_support():
    """Transfer the customer to the support team."""
    return support_agent

triage = Agent(
    name="Triage",
    instructions="Route the customer to the right department.",
    tools=[transfer_to_sales, transfer_to_support],
)

response = triage.run_sync("I want to buy a new laptop.")
print(response.agent.name)  # "Sales Agent"
```
## Nested agent calls (non-handoff)

If you want an agent to process a sub-task and return the result as a tool output (without transferring control), use `ToolResult`:

```python
from slimagents import ToolResult

researcher = Agent(instructions="You are a research assistant.")

def research(topic: str):
    """Research a topic using a specialized agent."""
    return ToolResult(agent=researcher, handoff=False)

agent = Agent(tools=[research])
# The researcher processes the topic, and its response becomes the tool result.
# Control stays with `agent`.
```
## Structured outputs

Use `response_format` to get typed responses instead of plain strings.

### Pydantic models

```python
from pydantic import BaseModel
from slimagents import Agent

class MovieReview(BaseModel):
    title: str
    rating: float
    summary: str

agent = Agent[MovieReview](
    instructions="You are a movie critic.",
    response_format=MovieReview,
)

review = agent.apply("Review The Matrix")
print(review.title)    # "The Matrix"
print(review.rating)   # 9.0
print(review.summary)  # "A groundbreaking sci-fi film..."
```

### JSON mode

Pass `dict` to get a parsed JSON dictionary:

```python
agent = Agent[dict](response_format=dict)
data = agent.apply("Return a JSON object with fields: name, age")
print(data["name"])  # str
```

### Primitive types

You can also use `int`, `float`, `bool`, or `list` as the response format:

```python
agent = Agent[int](response_format=int)
count = agent.apply("How many continents are there?")
print(count)  # 7 (int, not str)
```
## Multi-modal inputs

Pass file-like objects, bytes, `FileContent`, or URLs alongside text. The agent handles base64 encoding and MIME type detection automatically:

```python
from slimagents import Agent

agent = Agent(
    model="gemini/gemini-2.0-flash",
    instructions="Describe the contents of the provided files.",
)

# File object
with open("photo.jpg", "rb") as f:
    description = agent.apply("What's in this image?", f)

# Another file type
with open("report.pdf", "rb") as pdf:
    summary = agent.apply("Summarize this document", pdf)
```

For programmatic file content, use `FileContent`:

```python
from slimagents.core import FileContent

content = FileContent(
    content=image_bytes,
    filename="chart.png",
    mime_type="image/png",
)
description = agent.apply("Describe this chart", content)
```
## Streaming

Enable streaming to receive tokens as they arrive:

```python
response = await agent.run("Tell me a story", stream=True)
async for chunk in response:
    if isinstance(chunk, str):
        print(chunk, end="", flush=True)
```

Fine-tune what gets streamed:

```python
response = await agent.run(
    "Tell me a story",
    stream=True,
    stream_tokens=True,      # Yield individual tokens as strings (default: True)
    stream_delimiters=True,  # Yield MessageDelimiter events for message boundaries
    stream_tool_calls=True,  # Yield tool call deltas as they arrive
    stream_response=True,    # Yield the final Response object at the end of the stream
)
```

When `stream_response=True`, the final item in the stream is a `Response` object:

```python
from slimagents import Response

async for chunk in response:
    if isinstance(chunk, Response):
        print(f"\nTokens used: {chunk.metadata.total_tokens}")
    elif isinstance(chunk, str):
        print(chunk, end="")
```
## The Response object

`run()` and `run_sync()` return a `Response[T]` with:

```python
response = agent.run_sync("Hello!")

response.value         # The response content (str, dict, or BaseModel depending on response_format)
response.memory_delta  # List of messages added during this call
response.agent         # The agent that produced the response (may differ from the original if a handoff occurred)
response.metadata      # ResponseMetadata with token counts and cost
```

`ResponseMetadata` tracks usage across all turns:

```python
meta = response.metadata
meta.input_tokens   # Total input tokens
meta.output_tokens  # Total output tokens
meta.total_tokens   # Total tokens
meta.cost           # Total cost (USD)
```
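When an application makes many agent calls, it is often useful to aggregate this metadata for budget tracking. SlimAgents does not ship such a helper, so the class below is a hypothetical sketch that works with any object exposing `total_tokens` and `cost` attributes, such as a `ResponseMetadata`:

```python
from types import SimpleNamespace

class UsageTracker:
    """Hypothetical helper that sums usage across calls (not part of SlimAgents)."""

    def __init__(self) -> None:
        self.total_tokens = 0
        self.cost = 0.0

    def record(self, metadata) -> None:
        # Accumulate the per-call totals into running totals.
        self.total_tokens += metadata.total_tokens
        self.cost += metadata.cost

# Usage with stand-in metadata objects:
tracker = UsageTracker()
tracker.record(SimpleNamespace(total_tokens=120, cost=0.0004))
tracker.record(SimpleNamespace(total_tokens=80, cost=0.0002))
print(tracker.total_tokens)  # 200
```

In real code you would call `tracker.record(response.metadata)` after each `run()`/`run_sync()` call.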
## Interactive CLI

Use `run_demo_loop` to quickly test an agent in your terminal:

```python
from slimagents import Agent, run_demo_loop

agent = Agent(instructions="You are a helpful assistant.")
run_demo_loop(agent, stream=True)
```

```
Starting SlimAgents CLI 🪶
User: Hello!
Agent: Hi there! How can I help you today?
User:
```
## Logging

SlimAgents uses Python's standard logging module:

```python
import logging
from slimagents import logger

logging.basicConfig(level=logging.INFO)
logger.setLevel(logging.DEBUG)  # Verbose agent logs
```
## Origin

SlimAgents started as a fork of OpenAI's Swarm framework. Major differences:

- Works with any LLM (not just OpenAI)
- Designed for subclassing `Agent` to encapsulate behavior
- Async-native, with concurrent tool execution
- Multi-modal input support
- Structured outputs with Pydantic
- Proper Python logging

## License

MIT