A Python wrapper for OpenAI-compatible APIs with cost tracking and async/sync support
Project description
ModelStash
A lightweight Python library for managing and invoking multiple AI models with built-in cost tracking, token counting, and full chat history support.
Status: Maintenance Mode - No new features planned.
Features
- Multi-model management - Register and switch between multiple AI models via
ModelContainer - Sync & async support - Use
invoke()for synchronous calls orainvoke()for async - Chat history - Send multi-turn conversations with system prompts and message history
- Stateful sessions - Use context manager sessions (
chat()/achat()) for automatic history tracking - Vision support - Pass images to models that support multimodal inputs, with per-message image support
- Cost tracking - Automatic token counting and cost calculation per request
- OpenRouter compatible - Works with any OpenAI-compatible API endpoint
Installation
pip install ModelStash
Quick Start
from ModelStash import ModelContainer
container = ModelContainer(api_key="your-api-key")
container.add(
name="flash",
model_name="google/gemini-2.0-flash-001",
input_cost=0.0,
output_cost=0.0,
)
result = container.flash.invoke("Hello, world!")
print(result.content)
print(f"Cost: ${result.metadata.cost:.6f}")
API Reference
ModelContainer
Manages a collection of models and their HTTP clients.
container = ModelContainer(api_key="...", base_url="https://openrouter.ai/api/v1")
| Method | Description |
|---|---|
add(name, model_name, input_cost, output_cost, temperature=0) |
Register a new model |
get(model_name) |
Get a model by name (via __getattr__) |
Model
Represents a single model configuration.
model = container.add("name", "model-id", input_cost=0.0, output_cost=0.0)
| Method | Description |
|---|---|
invoke(prompt) |
Synchronous call with a string or list of messages |
ainvoke(prompt) |
Async call with a string or list of messages |
calculate_cost(input_tokens, output_tokens) |
Calculate cost for tokens |
chat(initial_messages=None) |
Start a synchronous chat session (context manager) |
achat(initial_messages=None) |
Start an async chat session (context manager) |
Message
Returned by model invocations.
@dataclass
class Message:
content: str # Model's response text
metadata: Metadata # Token usage and cost info
Metadata
Token usage and cost data.
@dataclass
class Metadata:
input_tokens: int # Prompt tokens used
output_tokens: int # Completion tokens used
cost: float # Total cost in USD
Role
Enum for message roles when using raw dicts.
class Role(str, Enum):
SYSTEM = "system"
USER = "user"
ASSISTANT = "assistant"
Typed Message Classes
Convenient dataclasses for building messages with IDE autocomplete.
Note: The
imagestype hint looks complex because it encodes both orderings:(bytes, mime_type)or(mime_type, bytes). In practice, just pass(image_bytes, ImageType.PNG)or("image/png", image_bytes)— both work.
@dataclass
class SystemMessage:
content: str
@dataclass
class UserMessage:
content: str
images: tuple[bytes, str | ImageType] | tuple[str | ImageType, bytes] | list[...] | None = None
@dataclass
class AssistantMessage:
content: str
images: tuple[bytes, str | ImageType] | tuple[str | ImageType, bytes] | list[...] | None = None
ImageType
Supported image MIME types:
ImageType.PNGImageType.JPEGImageType.JPGImageType.WEBPImageType.GIF
Examples
Simple Text Prompt
from ModelStash import ModelContainer
container = ModelContainer(api_key="...")
container.add("flash", "google/gemini-2.0-flash-001", 0.0, 0.0)
result = container.flash.invoke("What is the capital of France?")
print(result.content)
Multi-Message Conversation (Stateless)
Send a full conversation history in a single call. The model sees all messages but no state is retained.
from ModelStash import ModelContainer, Role, SystemMessage, UserMessage, AssistantMessage
container = ModelContainer(api_key="...")
container.add("flash", "google/gemini-2.0-flash-001", 0.0, 0.0)
# Using typed message classes
result = container.flash.invoke([
SystemMessage("You are a helpful coding assistant."),
UserMessage("How do I reverse a list in Python?"),
AssistantMessage("You can use list[::-1] or the reversed() function."),
UserMessage("Which is faster?"),
])
print(result.content)
# Using raw dicts with Role enum
result = container.flash.invoke([
{"role": Role.SYSTEM, "content": "You are a helpful coding assistant."},
{"role": Role.USER, "content": "How do I reverse a list in Python?"},
{"role": Role.ASSISTANT, "content": "You can use list[::-1] or the reversed() function."},
{"role": Role.USER, "content": "Which is faster?"},
])
With Image Input
Images are passed as (bytes, mime_type) or (mime_type, bytes) tuples. Order doesn't matter — the library detects which element is the bytes and which is the MIME type.
Note: The type hint looks complex because it encodes both orderings. In practice, just pass
(image_bytes, "image/png")or("image/png", image_bytes)— both work.
from ModelStash import ModelContainer, ImageType
container = ModelContainer(api_key="...")
container.add("flash", "google/gemini-2.0-flash-001", 0.0, 0.0)
with open("image.png", "rb") as f:
image_bytes = f.read()
# Single image
result = container.flash.invoke([
{"role": Role.USER, "content": "Describe this image", "images": [(image_bytes, ImageType.PNG)]},
])
# Or using typed classes
from ModelStash import UserMessage
result = container.flash.invoke([
UserMessage("Describe this image", images=(image_bytes, ImageType.PNG)),
])
# Multiple images with mixed types
with open("photo.jpg", "rb") as f:
jpg_bytes = f.read()
result = container.flash.invoke([
UserMessage("Compare these images", images=[
(image_bytes, ImageType.PNG),
(jpg_bytes, "image/jpeg"), # Can use raw strings too
]),
])
Async Usage
import asyncio
from ModelStash import ModelContainer
async def main():
container = ModelContainer(api_key="...")
container.add("flash", "google/gemini-2.0-flash-001", 0.0, 0.0)
result = await container.flash.ainvoke("What is this?")
print(result.content)
asyncio.run(main())
Stateful Chat Session
Use chat() (sync) or achat() (async) for automatic history tracking. The session maintains the full conversation and appends messages only after successful API calls.
from ModelStash import ModelContainer, SystemMessage
container = ModelContainer(api_key="...")
container.add("flash", "google/gemini-2.0-flash-001", 0.0, 0.0)
with container.flash.chat([SystemMessage("You are a pirate.")]) as session:
msg1 = session.send("Hello!")
print(msg1.content) # "Ahoy matey!"
msg2 = session.send("What's the weather like?")
print(msg2.content)
# Access full conversation history
print(session.history)
# History is fully editable
session.history.pop() # Remove last assistant message
import asyncio
from ModelStash import ModelContainer
async def main():
container = ModelContainer(api_key="...")
container.add("flash", "google/gemini-2.0-flash-001", 0.0, 0.0)
async with container.flash.achat() as session:
msg = await session.send("Tell me a joke")
print(msg.content)
asyncio.run(main())
Session with Images
from ModelStash import ModelContainer, ImageType
container = ModelContainer(api_key="...")
container.add("flash", "google/gemini-2.0-flash-001", 0.0, 0.0)
with open("image.png", "rb") as f:
image_bytes = f.read()
with container.flash.chat() as session:
msg = session.send("What's in this image?", images=(image_bytes, ImageType.PNG))
print(msg.content)
msg = session.send("What color is the main object?")
print(msg.content)
License
GPL-3.0-only
Project details
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
Filter files by name, interpreter, ABI, and platform.
If you're not sure about the file name format, learn more about wheel file names.
Copy a direct link to the current filters
File details
Details for the file modelstash-2.1.0.tar.gz.
File metadata
- Download URL: modelstash-2.1.0.tar.gz
- Upload date:
- Size: 61.6 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.14.0
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
6eb6ef906c199af14223bdc72637220a348e357914620461c203fd52d48f532f
|
|
| MD5 |
9a3ac5ed662e0a255a4041fa321bdc4d
|
|
| BLAKE2b-256 |
24a8715453fe67d32913f80998e196847a75f5ac86eb6778404d3cac67884478
|
File details
Details for the file modelstash-2.1.0-py3-none-any.whl.
File metadata
- Download URL: modelstash-2.1.0-py3-none-any.whl
- Upload date:
- Size: 18.1 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.14.0
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
decf53fa85e76fec29067636e382f7c29ddc48f048c45ceea494ff32d1866352
|
|
| MD5 |
719a2f5965de79b811429cd8dc1713b0
|
|
| BLAKE2b-256 |
57d01eecd0ef2d14d60c8946c31f32cee9afe18cd31172769ff43c7b99521d5f
|