Official Python SDK for MBS Workbench — OpenAI-compatible local AI inference

mbs-python — Python SDK for MBS Workbench

Official async Python client for MBS Workbench. Connects to a running mbsd daemon or any OpenAI-compatible server.

Installation

pip install mbs-python
# with NumPy support:
pip install "mbs-python[numpy]"
# with Pandas support:
pip install "mbs-python[pandas]"
# everything:
pip install "mbs-python[all]"

Quick Start

import asyncio
from mbs import MbsClient

async def main():
    async with MbsClient(base_url="http://127.0.0.1:3030") as client:
        # Chat completion
        resp = await client.chat([
            {"role": "user", "content": "Explain Rust ownership in 2 sentences."}
        ])
        print(resp.choices[0].message.content)

        # Streaming
        async for delta in await client.chat_stream([
            {"role": "user", "content": "Write a haiku about async."}
        ]):
            print(delta, end="", flush=True)
        print()

        # Embeddings
        resp = await client.embed("Hello, world!")
        print(len(resp.data[0].embedding))  # vector dimension

asyncio.run(main())

API Reference

MbsClient

client = MbsClient(
    base_url="http://127.0.0.1:3030",  # default
    api_key=None,                        # optional bearer token
    timeout=120.0,                       # seconds
    max_retries=3,                       # on 429/5xx
    retry_base_delay_ms=500,
)

| Method | Description |
| --- | --- |
| await client.models() | List available models → ModelsResponse |
| await client.load_model(req) | Load a .gguf into VRAM → ModelLoadResponse |
| await client.unload_model() | Unload current model → ModelUnloadResponse |
| await client.chat(messages, *, model, temperature, max_tokens, stop) | Chat completion → ChatCompletionResponse |
| await client.chat_stream(messages, ...) | Async iterator yielding text deltas |
| await client.complete(prompt, ...) | Text completion → CompletionResponse |
| await client.embed(input, *, model) | Embeddings → EmbeddingResponse |
| await client.embed_numpy(input, *, model) | Embeddings as NumPy array (requires [numpy]) |
| await client.generate_image(prompt, *, n, size) | Image generation → ImageGenerationResponse |
| await client.run_agent(task, *, model, max_iterations) | ReAct agent → AgentRunResponse |
| await client.list_tools() | List MCP tools → McpToolsResponse |
| await client.invoke_tool(tool_id, arguments) | Invoke MCP tool → McpInvokeResponse |
| await client.anthropic_messages(req) | Anthropic-compatible → AnthropicMessagesResponse |
| await client.batch_chat(requests, *, concurrency) | Batch chat → BatchSummary |
| await client.batch_embed(inputs, *, model, concurrency) | Batch embeddings → BatchSummary |
| await client.batch_embed_dataframe(series, ...) | Add embedding column to Pandas Series (requires [pandas]) |
| await client.ping() | Health check → bool |

Streaming

# Run inside an async function with an open `client` (see Quick Start).
async for delta in await client.chat_stream([
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user",   "content": "Tell me about Python 3.12."},
]):
    print(delta, end="", flush=True)
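
To make the idea of a "text delta" concrete, here is a minimal sketch of how deltas can be pulled out of an OpenAI-style server-sent-events stream. It assumes the server emits the usual `data: {...}` chunk lines terminated by `data: [DONE]`; the helper name `iter_text_deltas` and the sample lines are hypothetical and are not part of mbs-python, whose internals are not shown in this README.

```python
import json
from typing import Iterable, Iterator


def iter_text_deltas(lines: Iterable[str]) -> Iterator[str]:
    """Yield text fragments from OpenAI-style SSE lines (hypothetical helper)."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators and non-data fields
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break  # end-of-stream sentinel
        chunk = json.loads(payload)
        for choice in chunk.get("choices", []):
            text = choice.get("delta", {}).get("content")
            if text:  # role-only or empty deltas carry no text
                yield text


# Hand-written sample stream, for illustration only.
sample = [
    'data: {"choices": [{"delta": {"role": "assistant"}}]}',
    "",
    'data: {"choices": [{"delta": {"content": "Hello"}}]}',
    "",
    'data: {"choices": [{"delta": {"content": ", world"}}]}',
    "",
    "data: [DONE]",
]

print("".join(iter_text_deltas(sample)))
```

With the SDK you would normally not parse the stream yourself; chat_stream already yields the text pieces.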

Batch Processing

from mbs import ChatCompletionRequest, ChatMessage

requests = [
    ChatCompletionRequest(messages=[ChatMessage(role="user", content=f"What is {i}^2?")])
    for i in range(10)
]

summary = await client.batch_chat(requests, concurrency=5)
print(f"{summary.succeeded}/{len(summary.results)} succeeded")

for item in summary.results:
    if item.ok:
        data = item.value  # dict from ChatCompletionResponse.model_dump()
        print(data["choices"][0]["message"]["content"])
    else:
        print(f"Error: {item.error}")
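
The concurrency argument caps how many requests are in flight at once. The sketch below shows one common way to get that behaviour with asyncio.Semaphore, collecting a success flag per item instead of raising. It is an assumption about the general technique, not the SDK's actual implementation; `run_bounded`, `fake_request`, and `stats` are hypothetical names.

```python
import asyncio


async def run_bounded(jobs, concurrency):
    """Run coroutine factories with at most `concurrency` in flight.

    Returns a list of (ok, value_or_error) tuples in input order.
    """
    sem = asyncio.Semaphore(concurrency)

    async def guarded(job):
        async with sem:  # blocks while `concurrency` jobs are already running
            try:
                return True, await job()
            except Exception as exc:  # record the failure instead of aborting the batch
                return False, str(exc)

    return await asyncio.gather(*(guarded(job) for job in jobs))


# Stand-in for a network call; tracks how many calls overlap.
stats = {"in_flight": 0, "peak": 0}


async def fake_request(i):
    stats["in_flight"] += 1
    stats["peak"] = max(stats["peak"], stats["in_flight"])
    await asyncio.sleep(0.01)
    stats["in_flight"] -= 1
    if i == 3:
        raise RuntimeError("simulated failure")
    return i * i


async def demo():
    jobs = [lambda i=i: fake_request(i) for i in range(10)]
    return await run_bounded(jobs, concurrency=5)


results = asyncio.run(demo())
succeeded = sum(1 for ok, _ in results if ok)
print(f"{succeeded}/{len(results)} succeeded, peak in flight: {stats['peak']}")
```

Capturing errors per item mirrors the item.ok / item.error pattern above: one failed request does not cancel the rest of the batch.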

NumPy / Pandas Integration

import numpy as np
import pandas as pd

# Get embeddings as a numpy array — shape (n, dim)
vectors = await client.embed_numpy(["hello", "world", "rust"])
print(vectors.shape)  # e.g. (3, 4096)

# Add embedding column to a DataFrame
df = pd.DataFrame({"text": ["foo", "bar", "baz"]})
df["embedding"] = await client.batch_embed_dataframe(
    df["text"], concurrency=3
)

Error Handling

from mbs import MbsError

try:
    resp = await client.chat([{"role": "user", "content": "Hi"}])
except MbsError as e:
    print(f"HTTP {e.status_code}: {e}")

Transient errors (HTTP 429 and 500–504) are retried automatically with exponential backoff and jitter, up to max_retries attempts, with delays scaled from retry_base_delay_ms.
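
For intuition about the retry schedule, here is a minimal sketch of exponential backoff with "full jitter". The README does not state the exact formula the SDK uses, so the doubling factor, the `cap_ms` ceiling, and the function names `backoff_delays_ms` and `is_retryable` are assumptions for illustration; only the defaults 3 and 500 come from the constructor shown earlier.

```python
import random


def is_retryable(status_code):
    """True for the status codes listed as transient: 429 and 500-504."""
    return status_code == 429 or 500 <= status_code <= 504


def backoff_delays_ms(max_retries=3, base_delay_ms=500, cap_ms=30_000, rng=random.random):
    """Return one randomized delay (in ms) per retry attempt.

    Assumed scheme: the ceiling doubles each attempt (base * 2**attempt),
    is clamped to cap_ms, and the actual delay is drawn uniformly below it.
    """
    delays = []
    for attempt in range(max_retries):
        ceiling = min(cap_ms, base_delay_ms * 2 ** attempt)
        delays.append(rng() * ceiling)
    return delays


# Upper bounds of the schedule (jitter pinned to its maximum).
print(backoff_delays_ms(rng=lambda: 1.0))
# One randomized schedule, as a retry loop would actually see it.
print(backoff_delays_ms())
```

Jitter spreads retries from many clients over time, so they do not all hit the server again at the same instant.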

Context Manager

async with MbsClient() as client:
    print(await client.ping())
# connection pool is closed automatically
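
The automatic cleanup relies on Python's async context manager protocol. The sketch below shows the mechanism with stand-in classes; `SketchClient` and `FakePool` are hypothetical and only mimic the shape of a client that owns a connection pool, they are not MbsClient's real internals.

```python
import asyncio


class FakePool:
    """Stand-in for an HTTP connection pool."""

    def __init__(self):
        self.closed = False

    async def aclose(self):
        self.closed = True


class SketchClient:
    """Hypothetical client that closes its pool when the `async with` block exits."""

    def __init__(self):
        self.pool = FakePool()

    async def __aenter__(self):
        return self

    async def __aexit__(self, exc_type, exc, tb):
        await self.pool.aclose()  # runs on normal exit and on exceptions
        return False  # do not swallow exceptions raised inside the block

    async def ping(self):
        return not self.pool.closed


async def demo():
    async with SketchClient() as client:
        alive_inside = await client.ping()
    return alive_inside, client.pool.closed


print(asyncio.run(demo()))
```

Because `__aexit__` also runs when the block raises, the pool is released even if a request fails partway through.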

Running Tests

cd sdk/python
pip install -e ".[dev]"
pytest tests/ -v
