# mbs-python — Python SDK for MBS Workbench

Official async Python client for MBS Workbench. Connects to a running `mbsd` daemon or any OpenAI-compatible server.
## Installation

```bash
pip install mbs-python

# with NumPy support:
pip install "mbs-python[numpy]"

# with Pandas support:
pip install "mbs-python[pandas]"

# everything:
pip install "mbs-python[all]"
```
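Before calling the NumPy- or Pandas-backed helpers, you can check at runtime whether the matching extra is installed; a small sketch using only the standard library (the `has_extra` helper is hypothetical, not part of the SDK):

```python
import importlib.util


def has_extra(module_name: str) -> bool:
    """Return True if an optional dependency is importable."""
    return importlib.util.find_spec(module_name) is not None


# NumPy-backed helpers such as embed_numpy need the [numpy] extra:
if not has_extra("numpy"):
    print("Install with: pip install 'mbs-python[numpy]'")
```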
## Quick Start

```python
import asyncio

from mbs import MbsClient


async def main():
    async with MbsClient(base_url="http://127.0.0.1:3030") as client:
        # Chat completion
        resp = await client.chat([
            {"role": "user", "content": "Explain Rust ownership in 2 sentences."}
        ])
        print(resp.choices[0].message.content)

        # Streaming
        async for delta in await client.chat_stream([
            {"role": "user", "content": "Write a haiku about async."}
        ]):
            print(delta, end="", flush=True)
        print()

        # Embeddings
        resp = await client.embed("Hello, world!")
        print(len(resp.data[0].embedding))  # vector dimension


asyncio.run(main())
```
## API Reference

### MbsClient

```python
client = MbsClient(
    base_url="http://127.0.0.1:3030",  # default
    api_key=None,                      # optional bearer token
    timeout=120.0,                     # seconds
    max_retries=3,                     # on 429/5xx
    retry_base_delay_ms=500,
)
```
| Method | Description |
|---|---|
| `await client.models()` | List available models → `ModelsResponse` |
| `await client.load_model(req)` | Load a `.gguf` into VRAM → `ModelLoadResponse` |
| `await client.unload_model()` | Unload the current model → `ModelUnloadResponse` |
| `await client.chat(messages, *, model, temperature, max_tokens, stop)` | Chat completion → `ChatCompletionResponse` |
| `await client.chat_stream(messages, ...)` | Async iterator yielding text deltas |
| `await client.complete(prompt, ...)` | Text completion → `CompletionResponse` |
| `await client.embed(input, *, model)` | Embeddings → `EmbeddingResponse` |
| `await client.embed_numpy(input, *, model)` | Embeddings as a NumPy array (requires `[numpy]`) |
| `await client.generate_image(prompt, *, n, size)` | Image generation → `ImageGenerationResponse` |
| `await client.run_agent(task, *, model, max_iterations)` | ReAct agent → `AgentRunResponse` |
| `await client.list_tools()` | List MCP tools → `McpToolsResponse` |
| `await client.invoke_tool(tool_id, arguments)` | Invoke an MCP tool → `McpInvokeResponse` |
| `await client.anthropic_messages(req)` | Anthropic-compatible messages → `AnthropicMessagesResponse` |
| `await client.batch_chat(requests, *, concurrency)` | Batch chat → `BatchSummary` |
| `await client.batch_embed(inputs, *, model, concurrency)` | Batch embeddings → `BatchSummary` |
| `await client.batch_embed_dataframe(series, ...)` | Embeddings for a Pandas `Series` (requires `[pandas]`) |
| `await client.ping()` | Health check → `bool` |
### Streaming

```python
async for delta in await client.chat_stream([
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Tell me about Python 3.12."},
]):
    print(delta, end="", flush=True)
```
### Batch Processing

```python
from mbs import ChatCompletionRequest, ChatMessage

requests = [
    ChatCompletionRequest(messages=[ChatMessage(role="user", content=f"What is {i}^2?")])
    for i in range(10)
]

summary = await client.batch_chat(requests, concurrency=5)
print(f"{summary.succeeded}/{len(summary.results)} succeeded")

for item in summary.results:
    if item.ok:
        data = item.value  # dict from ChatCompletionResponse.model_dump()
        print(data["choices"][0]["message"]["content"])
    else:
        print(f"Error: {item.error}")
```
### NumPy / Pandas Integration

```python
import numpy as np
import pandas as pd

# Get embeddings as a NumPy array — shape (n, dim)
vectors = await client.embed_numpy(["hello", "world", "rust"])
print(vectors.shape)  # e.g. (3, 4096)

# Add an embedding column to a DataFrame
df = pd.DataFrame({"text": ["foo", "bar", "baz"]})
df["embedding"] = await client.batch_embed_dataframe(
    df["text"], concurrency=3
)
```
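The `(n, dim)` matrix returned by `embed_numpy` plugs directly into standard vector math. A minimal cosine-similarity sketch, using synthetic stand-in vectors where the real ones would come from `embed_numpy` (the `cosine_sim` helper is illustrative, not part of the SDK):

```python
import numpy as np


def cosine_sim(query: np.ndarray, matrix: np.ndarray) -> np.ndarray:
    """Cosine similarity between one query vector and each row of matrix."""
    q = query / np.linalg.norm(query)
    m = matrix / np.linalg.norm(matrix, axis=1, keepdims=True)
    return m @ q


# Synthetic stand-ins for embed_numpy output, shape (3, 4)
vectors = np.array([
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [1.0, 1.0, 0.0, 0.0],
])
scores = cosine_sim(vectors[0], vectors)
print(scores.round(3))  # row 0 matches itself with score 1.0
```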
### Error Handling

```python
from mbs import MbsError

try:
    resp = await client.chat([{"role": "user", "content": "Hi"}])
except MbsError as e:
    print(f"HTTP {e.status_code}: {e}")
```

Transient errors (429, 500–504) are retried automatically with exponential backoff plus jitter.
### Context Manager

```python
async with MbsClient() as client:
    print(await client.ping())
# connection pool is closed automatically
```
## Running Tests

```bash
cd sdk/python
pip install -e ".[dev]"
pytest tests/ -v
```