A slim, intuitive, lightweight Python library for calling LLMs (high-level + low-level) with multi-provider support.

SlimX (slimx) — v0.8.0

SlimX is a tiny, inspectable LLM runtime for building vendor-neutral AI software across cloud and local models.

It is designed around two clearly separated APIs:

  • High-level API (slimx) — “1-minute productivity”: llm(...), .stream(...), .json(...), tools, retries.
  • Low-level API (slimx.low) — “systems builder primitives”: explicit Client, ChatRequest, Message, provider registry, middleware.

SlimX supports multiple providers — OpenAI, Anthropic, Ollama, and Google Gemini — plus provider plugins for third-party providers without modifying core.

How it works: ARCHITECTURE.md is an annotated, diagram-driven tour of the runtime.

How we build it: DEVELOPMENT.md covers the engineering charter, the Provider Contract, and the roadmap.


Install

For users

Create a new project and install SlimX:

uv init my-project
cd my-project
uv add slimx

Run Python through uv so it uses the project virtual environment:

uv run python

Or install with pip:

pip install slimx

For contributors

git clone https://github.com/slimx-ai/slimx.git
cd slimx
uv sync --all-extras
uv run pytest -q

uv sync reads pyproject.toml and uv.lock when present.

uv.lock is committed to help contributors reproduce the development environment.


Supported providers

Provider           Prefix      Environment variable                   Notes
OpenAI             openai:     OPENAI_API_KEY                         Default provider when no prefix is given
OpenAI-compatible  oai:        SLIMX_OAI_BASE_URL, SLIMX_OAI_API_KEY  OpenAI-compatible /v1/chat/completions API: vLLM, LM Studio, llama.cpp server, LocalAI, Ollama /v1, internal gateways
Google Gemini      google:     GOOGLE_API_KEY or GEMINI_API_KEY       Supports chat, streaming, JSON output, and tools
Anthropic          anthropic:  ANTHROPIC_API_KEY                      Claude Messages API; supports chat, JSON output, and tools
Ollama             ollama:     optional OLLAMA_BASE_URL               Local models through Ollama
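
To make the prefix convention concrete, here is a minimal sketch of how a "prefix:model" string could be split. It is not SlimX's actual parser: split_model_id, KNOWN_PREFIXES, and DEFAULT_PROVIDER are hypothetical names, and the only facts taken from the table are the prefixes and the OpenAI default.

```python
# Hypothetical sketch of the "prefix:model" convention; not SlimX's real parser.
KNOWN_PREFIXES = {"openai", "oai", "google", "anthropic", "ollama"}
DEFAULT_PROVIDER = "openai"  # the table lists OpenAI as the default provider


def split_model_id(model_id: str) -> tuple[str, str]:
    """Split 'provider:model' into (provider, model)."""
    # Split on the first colon only: model names may contain colons
    # themselves, for example the Ollama tag 'llama3.2:3b'.
    head, sep, tail = model_id.partition(":")
    if sep and head in KNOWN_PREFIXES:
        return head, tail
    # No recognised prefix: treat the whole string as a model name.
    return DEFAULT_PROVIDER, model_id


print(split_model_id("ollama:llama3.2:3b"))            # ('ollama', 'llama3.2:3b')
print(split_model_id("oai:Qwen/Qwen2.5-7B-Instruct"))  # ('oai', 'Qwen/Qwen2.5-7B-Instruct')
print(split_model_id("gpt-4.1-nano"))                  # ('openai', 'gpt-4.1-nano')
```

Checking the head against a known set, rather than splitting blindly, keeps an unprefixed name that happens to contain a colon from being misread as a provider.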

Inspect provider capabilities

Check what a provider supports before runtime — no API key or running server required:

from slimx.providers import describe_provider

describe_provider("google")
# {'name': 'google', 'native': True, 'tools': True, 'structured_output': True,
#  'streaming': True, 'async_chat': False, 'async_streaming': False}

from slimx import llm
llm("openai:gpt-4.1-nano").capabilities.tools  # True

Every provider is checked against a shared conformance suite (tests/conformance/), so declared capabilities are tested against actual behavior rather than taken on trust. See docs: Provider Capabilities and docs: OpenAI-compatible servers.
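
One way to use these declarations is to fail early when a provider lacks a feature your application needs. The sketch below works on a plain dictionary copied from the describe_provider("google") output shown above, so it runs without SlimX installed. missing_capabilities is a hypothetical helper, not part of the library.

```python
# Capability dictionary copied from the describe_provider("google") output above.
caps = {
    "name": "google", "native": True, "tools": True, "structured_output": True,
    "streaming": True, "async_chat": False, "async_streaming": False,
}


def missing_capabilities(caps: dict, required: list[str]) -> list[str]:
    """Return the required capability names the provider does not declare."""
    # A capability that is absent from the dictionary counts as unsupported.
    return [name for name in required if not caps.get(name, False)]


print(missing_capabilities(caps, ["tools", "streaming"]))   # []
print(missing_capabilities(caps, ["tools", "async_chat"]))  # ['async_chat']
```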


Configure providers

OpenAI

export OPENAI_API_KEY="..."
# optional:
export OPENAI_BASE_URL="https://api.openai.com/v1"

OpenAI-compatible servers

Use oai: for local or self-hosted servers that expose an OpenAI-compatible /v1/chat/completions API.

export SLIMX_OAI_BASE_URL="http://localhost:8000/v1"
export SLIMX_OAI_API_KEY="EMPTY"

SLIMX_OAI_API_KEY can be a real key for authenticated gateways, or EMPTY for local servers that ignore authentication.

Google Gemini

export GOOGLE_API_KEY="..."
# or:
export GEMINI_API_KEY="..."

# optional:
export GOOGLE_BASE_URL="https://generativelanguage.googleapis.com/v1beta"
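
Since either variable can carry the Gemini key, a small lookup with a fallback illustrates the idea. This is a sketch under assumptions: resolve_google_key is a hypothetical helper, and the order in which the two variables are checked is a guess, because the source only says that either one works.

```python
import os
from typing import Mapping, Optional


def resolve_google_key(env: Mapping[str, str] = os.environ) -> Optional[str]:
    """Return the first non-empty key among the two documented variables."""
    # Assumption: GOOGLE_API_KEY is consulted first. The real precedence
    # inside SlimX is not stated in this document.
    for name in ("GOOGLE_API_KEY", "GEMINI_API_KEY"):
        value = env.get(name, "").strip()
        if value:
            return value
    return None


print(resolve_google_key({"GEMINI_API_KEY": "abc"}))  # abc
print(resolve_google_key({}))                         # None
```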

Anthropic

export ANTHROPIC_API_KEY="..."
# optional:
export ANTHROPIC_BASE_URL="https://api.anthropic.com"
export ANTHROPIC_VERSION="2023-06-01"

Ollama local models

export OLLAMA_BASE_URL="http://localhost:11434"

For Ollama, make sure the server is running and the model is available:

ollama serve

In another terminal:

ollama pull llama3.2:3b
ollama list

Quickstart

OpenAI

from slimx import llm

m = llm("openai:gpt-4.1-nano", temperature=0.2)
res = m("Write a haiku about fog and streetlights.")

print(res.text)

OpenAI-compatible local/self-hosted server

from slimx import llm

m = llm(
    "oai:Qwen/Qwen2.5-7B-Instruct",
    provider_kwargs={
        "base_url": "http://localhost:8000/v1",
        "api_key": "EMPTY",
    },
    timeout=120,
)

res = m("Explain why compatibility APIs are useful for local model serving.")

print(res.text)

Google Gemini

from slimx import llm

m = llm("google:gemini-3.5-flash", temperature=0.2)
res = m("Write a haiku about small, inspectable AI software.")

print(res.text)

Ollama local model

from slimx import llm

m = llm("ollama:llama3.2:3b", temperature=0.2, timeout=120)
res = m("Explain why small libraries are easier to inspect.")

print(res.text)

Response structure

Calling a SlimX model returns a Result object.

from slimx import llm

m = llm("ollama:llama3.2:3b", timeout=120)
res = m("Explain why small libraries are easier to inspect.")

print(res.text)

A Result contains:

Result(
    text="...",          # Normalized assistant text
    raw={...},           # Raw provider response
    usage=Usage(...),    # Token usage when available
    tool_calls=[],       # Tool/function calls requested by the model
    data=None,           # Parsed structured output, used by .json(...)
    trace={...},         # Runtime metadata: provider, model, latency, retries, tools
)

Most applications should use:

print(res.text)

Use res.raw when you need provider-specific details, and res.trace when you want runtime diagnostics such as provider name, model name, elapsed time, retries, and tool-call count.
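
As an illustration of reading those fields together, the sketch below builds a stand-in object with the field names listed above and prints a one-line summary. FakeResult and describe are hypothetical, and the trace key names ("provider", "model", "retries") are assumptions, since the exact keys are not spelled out here.

```python
from dataclasses import dataclass, field
from typing import Any, Optional


@dataclass
class FakeResult:
    """Stand-in with the field names listed above; not SlimX's real class."""
    text: str
    raw: dict = field(default_factory=dict)
    usage: Optional[Any] = None
    tool_calls: list = field(default_factory=list)
    data: Optional[Any] = None
    trace: dict = field(default_factory=dict)


def describe(res) -> str:
    """Summarise a result in one line for logs."""
    # The trace keys used here are assumed names; use .get() so a missing
    # key degrades to a placeholder instead of raising KeyError.
    provider = res.trace.get("provider", "?")
    model = res.trace.get("model", "?")
    retries = res.trace.get("retries", 0)
    return (f"{provider}/{model}: {len(res.text)} chars, "
            f"{len(res.tool_calls)} tool calls, {retries} retries")


res = FakeResult(text="hello", trace={"provider": "ollama", "model": "llama3.2:3b"})
print(describe(res))  # ollama/llama3.2:3b: 5 chars, 0 tool calls, 0 retries
```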

Streaming

from slimx import llm

m = llm("google:gemini-3.5-flash", temperature=0.2)

for ev in m.stream("Tell a short story in 5 lines."):
    if ev.type == "text_delta":
        print(ev.text, end="", flush=True)

print()
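
If you also need the complete text once streaming finishes, the deltas can be collected as they arrive. The sketch below uses a stand-in event stream so it runs offline. FakeEvent, fake_stream, and the "usage" event type are hypothetical; only the "text_delta" type and the .text attribute come from the example above.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator


@dataclass
class FakeEvent:
    """Stand-in for a stream event with a type and optional text."""
    type: str
    text: str = ""


def fake_stream() -> Iterator[FakeEvent]:
    yield FakeEvent("text_delta", "Fog ")
    yield FakeEvent("usage")  # hypothetical non-text event, skipped below
    yield FakeEvent("text_delta", "lifts.")


def collect_text(events: Iterable[FakeEvent]) -> str:
    """Join the text of all text_delta events, ignoring other event types."""
    parts = []
    for ev in events:
        if ev.type == "text_delta":
            parts.append(ev.text)
    return "".join(parts)


print(collect_text(fake_stream()))  # Fog lifts.
```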

Tools

SlimX tools are provider-neutral. The same @tool interface can be used across providers that support tool/function calling.

from slimx import llm, tool


@tool
def add(a: int, b: int) -> int:
    "Add two integers."
    return a + b


m = llm("google:gemini-3.5-flash", tools=[add], tool_runtime="auto")
res = m("What is 12 + 30?")

print(res.text)
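
For intuition about what a decorator like @tool has to work with, the sketch below derives a JSON-Schema-style description from a function's signature, type hints, and docstring using only the standard library. It is an illustration of the general technique, not SlimX's implementation. describe_tool and the JSON_TYPES mapping are assumptions.

```python
import inspect
from typing import Callable, get_type_hints

# Assumed mapping from Python annotations to JSON Schema type names.
JSON_TYPES = {int: "integer", float: "number", str: "string", bool: "boolean"}


def describe_tool(fn: Callable) -> dict:
    """Build a function-calling style description from a plain function."""
    hints = get_type_hints(fn)
    params = inspect.signature(fn).parameters
    # Unannotated or unknown types fall back to "string" in this sketch.
    properties = {
        name: {"type": JSON_TYPES.get(hints.get(name), "string")}
        for name in params
    }
    # Parameters without a default value are treated as required.
    required = [
        name for name, p in params.items()
        if p.default is inspect.Parameter.empty
    ]
    return {
        "name": fn.__name__,
        "description": inspect.getdoc(fn) or "",
        "parameters": {"type": "object", "properties": properties, "required": required},
    }


def add(a: int, b: int) -> int:
    "Add two integers."
    return a + b


schema = describe_tool(add)
print(schema["name"], schema["parameters"]["required"])  # add ['a', 'b']
```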

Parallel execution

Fan one prompt out to several models at once with parallel(...). Use mode="all" to compare every answer, or mode="race" for the first successful response.

from slimx import parallel

ensemble = parallel(["google:gemini-3.5-flash", "openai:gpt-4.1-nano"])
res = ensemble("Explain SlimX in one paragraph.")

for item in res.results:
    print(item.model, item.result.text if item.ok else item.error)

Failures are surfaced in res.errors (never swallowed) and each result keeps its raw provider response. See docs: Parallel execution.
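
The difference between the two modes can be sketched with the standard library alone. The code below is not how parallel(...) is implemented. It is a minimal thread-pool fan-out in which fan_out, shout, and broken are hypothetical stand-ins, shown only to illustrate "collect everything" versus "first success wins" while keeping errors visible.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from typing import Callable, Dict, Tuple


def fan_out(
    calls: Dict[str, Callable[[str], str]], prompt: str, mode: str = "all"
) -> Tuple[Dict[str, str], Dict[str, Exception]]:
    """Run every callable on the prompt and return (results, errors) by name."""
    results: Dict[str, str] = {}
    errors: Dict[str, Exception] = {}
    with ThreadPoolExecutor(max_workers=max(1, len(calls))) as pool:
        futures = {pool.submit(fn, prompt): name for name, fn in calls.items()}
        for fut in as_completed(futures):
            name = futures[fut]
            try:
                results[name] = fut.result()
            except Exception as exc:
                errors[name] = exc  # record the failure instead of swallowing it
                continue
            if mode == "race":
                break  # first success wins; later answers are ignored
    return results, errors


def shout(prompt: str) -> str:  # stand-in for a working model
    return prompt.upper()


def broken(prompt: str) -> str:  # stand-in for a failing provider
    raise RuntimeError("provider down")


results, errors = fan_out({"shout": shout, "broken": broken}, "hi")
print(results)         # {'shout': 'HI'}
print(sorted(errors))  # ['broken']
```

In this sketch the race mode still waits for the remaining threads to finish when the pool closes. A production version would likely want cancellation or timeouts.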


Structured output

SlimX can parse structured JSON output into a dataclass.

from dataclasses import dataclass

from slimx import llm


@dataclass
class City:
    name: str
    country: str


m = llm("google:gemini-3.5-flash")
res = m.json("Paris is in France.", schema=City)

print(res.data)
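
Conceptually, the last step of structured output is turning a JSON string into the dataclass. The sketch below shows that step in isolation with the standard library. parse_into is a hypothetical helper and makes no claim about how .json(...) validates or prompts internally.

```python
import json
from dataclasses import dataclass, fields


@dataclass
class City:
    name: str
    country: str


def parse_into(schema, payload: str):
    """Parse a JSON object string into an instance of the given dataclass."""
    data = json.loads(payload)
    expected = {f.name for f in fields(schema)}
    missing = expected - data.keys()
    if missing:
        raise ValueError(f"missing fields: {sorted(missing)}")
    # Drop unexpected keys so the constructor does not raise TypeError.
    return schema(**{k: v for k, v in data.items() if k in expected})


print(parse_into(City, '{"name": "Paris", "country": "France"}'))
# City(name='Paris', country='France')
```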

Low-level API

Use the low-level API when you want explicit control over messages, requests, clients, and providers.

from slimx import Message
from slimx.low import ChatRequest, Client
from slimx.providers import get_provider

provider = get_provider("google")
client = Client(provider, timeout=30, retries=2)

req = ChatRequest(
    model="gemini-3.5-flash",
    messages=[Message.user("Explain provider-neutral LLM clients in one paragraph.")],
    temperature=0.2,
)

res = client.chat(req)

print(res.text)
print(res.trace)
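
The retries=2 argument above suggests a retry loop around each request. How SlimX counts attempts or spaces them out is not specified here, so the following is only a generic sketch: call_with_retries is hypothetical, it assumes retries means extra attempts after the first, and it uses exponential backoff as one common choice.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def call_with_retries(fn: Callable[[], T], retries: int = 2, base_delay: float = 0.0) -> T:
    """Call fn, retrying up to `retries` extra times when it raises."""
    for attempt in range(retries + 1):
        try:
            return fn()
        except Exception:
            if attempt == retries:
                raise  # out of attempts: surface the last error
            # Exponential backoff: base_delay, 2*base_delay, 4*base_delay, ...
            time.sleep(base_delay * (2 ** attempt))
    raise ValueError("retries must be >= 0")


attempts = []


def flaky() -> str:  # stand-in for a request that fails twice, then succeeds
    attempts.append(1)
    if len(attempts) < 3:
        raise ConnectionError("temporary failure")
    return "ok"


print(call_with_retries(flaky, retries=2))  # ok
print(len(attempts))                        # 3
```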

Provider plugins

SlimX supports third-party provider plugins through the slimx.providers entry point group.

Built-in providers are registered lazily, so importing slimx does not load provider modules or require API keys.


Troubleshooting

ModuleNotFoundError: No module named 'slimx'

If you installed with uv add slimx, run Python through uv:

uv run python

Or activate the virtual environment first:

source .venv/bin/activate
python

Ollama model not found

Check which models are installed:

ollama list

Pull a model before using it:

ollama pull llama3.2:3b

Then use the exact model name:

from slimx import llm
m = llm("ollama:llama3.2:3b", timeout=120)

Ollama server not running

Start Ollama:

ollama serve

Then retry your SlimX script.
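
To tell "server not running" apart from other failures, a quick TCP check against the configured base URL can help. This sketch uses only the standard library. host_port and is_reachable are hypothetical helpers, and the default port 11434 is taken from the OLLAMA_BASE_URL example earlier in this document.

```python
import socket
from urllib.parse import urlsplit


def host_port(base_url: str, default_port: int = 11434) -> tuple[str, int]:
    """Extract (host, port) from a base URL, falling back to the default port."""
    parts = urlsplit(base_url)
    return parts.hostname or "localhost", parts.port or default_port


def is_reachable(base_url: str, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to the server can be opened."""
    try:
        with socket.create_connection(host_port(base_url), timeout=timeout):
            return True
    except OSError:  # refused, timed out, or name resolution failed
        return False


print(host_port("http://localhost:11434"))     # ('localhost', 11434)
print(is_reachable("http://localhost:11434"))  # True only if a server is listening
```

A successful connection only shows that something is listening on that port. It does not confirm that the model you want has been pulled.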


Development

Run the full validation suite before opening a pull request or tagging a release:

uv sync --all-extras
uv run ruff check .
uv run pyright
uv run pytest -q
uv run python -m build

Repo automation

This repository includes GitHub Actions for:

  • CI (.github/workflows/ci.yml)
  • Docs deployment to GitHub Pages (docs.yml)

See docs/ for more detailed documentation.
