Local browser-backed LLM and multimodal inference bridge

xlocllm

xlocllm is a Python SDK for local browser-backed AI inference. It exposes an OpenAI-compatible loopback API from Python while running model weights in a paired browser window through WebGPU/WebNN with MLC WebLLM and Transformers.js.

The goal is simple:

pip install xlocllm

Then:

import xlocllm

llm = xlocllm.unit("LLM", "Qwen-3.5-0.8b")
runtime = xlocllm.runtime([llm])
runtime.run()

print(runtime.url)  # http://127.0.0.1:1146/v1
print(runtime.chat("Say hello", temperature=0))

What It Does

Starts a local FastAPI bridge on 127.0.0.1.
Opens a paired browser app window.
Runs models inside the browser runtime.
Provides OpenAI-compatible /v1 endpoints for local clients.
Supports LLMs, embeddings, rerankers, translation, TTS, vision, ASR, and more through a shared catalog.
Keeps Python-side objects for models, units, runtimes, and bridges.
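Because the bridge speaks the OpenAI wire format, any HTTP client should be able to reach it without the SDK. The following is a minimal standard-library sketch, not part of xlocllm: build_chat_request and send are hypothetical helper names, the /chat/completions path is assumed from the OpenAI convention that the official client follows, and the bearer token mirrors the placeholder api_key used elsewhere in this README (whether the bridge checks it is not stated here).

```python
import json
import urllib.request


def build_chat_request(base_url: str, model: str, prompt: str) -> urllib.request.Request:
    """Build a POST request for an OpenAI-style chat completion (hypothetical helper)."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": 64,
    }
    return urllib.request.Request(
        # Assumed path: the OpenAI convention of <base_url>/chat/completions.
        url=base_url.rstrip("/") + "/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # Placeholder key; it may not be validated by the bridge at all.
            "Authorization": "Bearer xlocllm",
        },
        method="POST",
    )


def send(request: urllib.request.Request, timeout: float = 60.0) -> dict:
    """Send the request and decode the JSON body. Needs a running runtime."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

With a runtime already running, send(build_chat_request("http://127.0.0.1:1146/v1", "Qwen-3.5-0.8b", "Say hello")) should return a dictionary whose choices[0]["message"]["content"] holds the reply, provided the bridge follows the OpenAI response shape.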

Install

pip install xlocllm

Optional OpenAI client helper:

pip install "xlocllm[openai]"

Development install from this repository:

python -m pip install -e "./python/xlocllm[dev,openai]"

Quick Start

import xlocllm

runtime = xlocllm.runtime(
    [
        xlocllm.unit("LLM", "Qwen-3.5-0.8b"),
        xlocllm.unit("embedding", "multilingual-e5-small"),
    ]
)

runtime.install()
runtime.run()

print(runtime.status())

OpenAI-Compatible Usage

import xlocllm
from openai import OpenAI

llm = xlocllm.unit(type="LLM", model="Qwen-3.5-0.8b-fp32")
client = OpenAI(base_url="http://127.0.0.1:1146/v1", api_key="xlocllm")

with xlocllm.runtime([llm]) as runtime:
    runtime.run()
    response = client.chat.completions.create(
        model="Qwen-3.5-0.8b-fp32",
        messages=[{"role": "user", "content": "What is lidar?"}],
        max_tokens=64,
    )
    print(response.choices[0].message.content)

With the optional helper (installed through the openai extra), the runtime can create the client for you:

client = runtime.client()

Core API

model = xlocllm.model("Qwen-3.5-0.8b", unit="LLM")
models = xlocllm.models(unit="LLM", max_vram_mb=1500)
cpu_models = xlocllm.models(webgpu=False)

unit = xlocllm.unit("LLM", "Qwen-3.5-0.8b", reasoning=None)
runtime = xlocllm.runtime([unit], port=1146)
bridge = xlocllm.Bridge(port=1146)

print(runtime.url)
print(bridge.url)
print(xlocllm.bridges())
print(xlocllm.runtimes())
print(xlocllm.status())
print(xlocllm.benchmark())
print(xlocllm.benchmark("LLM"))

benchmark() temporarily opens a paired mini browser window by default to detect actual WebGPU/WebNN/NPU support, then closes it. When called with a unit type, such as benchmark("LLM"), it returns model recommendations for that type, optimized for speed and for quality.

Reasoning-capable LLMs can be configured at creation and toggled later on a live runtime:

llm = xlocllm.unit("LLM", "Qwen-3.5-0.8b-fp32", reasoning=False)
runtime.set_reasoning(llm.id, True)

CLI:

xlocllm status
xlocllm benchmark
xlocllm benchmark LLM
xlocllm models --unit LLM --no-webgpu
xlocllm run --unit LLM --model "Qwen-3.5-0.8b"

Documentation

Full English SDK docs: docs.md
Full Russian SDK docs: docs_ru.md
English model catalog: models.md
Russian model catalog: models_ru.md

Model Lookup

Look up a model by its exact modelId, its label, or one of its aliases:

xlocllm.unit("LLM", "Qwen-3.5-0.8b")
xlocllm.unit("LLM", "Qwen3.5-0.8B-q4f16_1-MLC")
xlocllm.unit("embedding", "multilingual-e5-small")

Browse the complete catalog in models.md.
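To illustrate what lookup by modelId, label, or alias could look like, here is a small self-contained sketch. It is not the xlocllm implementation: CatalogEntry, find_model, and every catalog entry below are hypothetical, and the real catalog may match names differently (for example, case-insensitively).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CatalogEntry:
    """Hypothetical catalog record; the real catalog schema may differ."""
    model_id: str
    label: str
    aliases: tuple[str, ...] = ()


def find_model(catalog: list[CatalogEntry], name: str) -> CatalogEntry:
    """Return the first entry whose modelId, label, or alias equals name exactly."""
    for entry in catalog:
        if name == entry.model_id or name == entry.label or name in entry.aliases:
            return entry
    raise KeyError(f"no catalog entry matches {name!r}")


# Made-up entries, used only to exercise the lookup.
EXAMPLE_CATALOG = [
    CatalogEntry("example-llm-q4-MLC", "Example LLM", ("example-llm", "ex-llm")),
    CatalogEntry("example-embedder-v1", "Example Embedder", ("example-embed",)),
]
```

All three spellings resolve to the same record: find_model(EXAMPLE_CATALOG, "example-llm-q4-MLC"), find_model(EXAMPLE_CATALOG, "Example LLM"), and find_model(EXAMPLE_CATALOG, "ex-llm") return the first entry, while an unknown name raises KeyError.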

Local State

By default, xlocllm stores bridge metadata and browser profiles under:

Windows: %LOCALAPPDATA%\xlocllm
Linux/macOS: $XDG_STATE_HOME/xlocllm or ~/.local/state/xlocllm

Environment variables:

XLOCLLM_HOME - overrides the local state directory.
XLOCLLM_WEB_URL - uses a custom web runtime URL.
XLOCLLM_LOG_LEVEL - sets the uvicorn log level.
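The default locations and the XLOCLLM_HOME override can be summarized as a small resolution function. This is a sketch of the rules listed above, not the library's own code: resolve_state_dir is a hypothetical name, the override is assumed to be used as-is (without an extra xlocllm suffix), and the Windows fallback for a missing LOCALAPPDATA is an assumption.

```python
from collections.abc import Mapping
from pathlib import PurePath, PurePosixPath, PureWindowsPath


def resolve_state_dir(env: Mapping[str, str], platform: str, home: str) -> PurePath:
    """Resolve the state directory from environment, sys.platform value, and home dir."""
    path_cls = PureWindowsPath if platform == "win32" else PurePosixPath

    # XLOCLLM_HOME wins on every platform (assumed to be used verbatim).
    override = env.get("XLOCLLM_HOME")
    if override:
        return path_cls(override)

    if platform == "win32":
        # Documented default: %LOCALAPPDATA%\xlocllm.
        # Assumption: fall back to <home>\AppData\Local if the variable is unset.
        base = env.get("LOCALAPPDATA") or str(PureWindowsPath(home, "AppData", "Local"))
        return PureWindowsPath(base, "xlocllm")

    # Documented default on Linux/macOS: $XDG_STATE_HOME/xlocllm,
    # otherwise ~/.local/state/xlocllm.
    xdg_state = env.get("XDG_STATE_HOME")
    if xdg_state:
        return PurePosixPath(xdg_state, "xlocllm")
    return PurePosixPath(home, ".local", "state", "xlocllm")
```

In a real program the arguments would typically be os.environ, sys.platform, and str(pathlib.Path.home()); passing them explicitly keeps the function easy to test on any operating system.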

Development Checks

python -m pytest python/xlocllm/tests
python -m ruff check python/xlocllm/src python/xlocllm/tests
python -m mypy python/xlocllm/src

Build the Python package:

cd python/xlocllm
python -m build

Notes

The bridge binds to loopback only. The browser window must remain open while browser-backed models are running. Without WebGPU, xlocllm exposes only the CPU/WASM-compatible Transformers.js subset and rejects heavier models before loading.
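Since the bridge only listens on loopback, a client can sanity-check a base URL before sending prompts to it. The helper below is a standard-library sketch with a hypothetical name (is_loopback_url); treating the hostname localhost as loopback is an assumption, because it depends on the machine's resolver configuration.

```python
import ipaddress
from urllib.parse import urlsplit


def is_loopback_url(url: str) -> bool:
    """Return True if the URL's host is a loopback address such as 127.0.0.1 or ::1."""
    host = urlsplit(url).hostname
    if host is None:
        return False
    if host == "localhost":
        # Assumption: localhost resolves to a loopback address on this machine.
        return True
    try:
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        # Not an IP literal (for example a DNS name), so it cannot be confirmed.
        return False
```

For example, is_loopback_url("http://127.0.0.1:1146/v1") is True, while a LAN address such as http://192.168.1.10:1146/v1 is rejected.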

License

MIT

Release history

1.1.0 - May 28, 2026
1.0.1 - May 25, 2026 (this version)
1.0.0 - May 25, 2026

Download files

Source Distribution

xlocllm-1.0.1.tar.gz (8.2 MB)

Uploaded May 25, 2026 Source

Built Distribution

xlocllm-1.0.1-py3-none-any.whl (8.3 MB)

Uploaded May 25, 2026 Python 3

File details

Details for the file xlocllm-1.0.1.tar.gz.

File metadata

Download URL: xlocllm-1.0.1.tar.gz
Upload date: May 25, 2026
Size: 8.2 MB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.14.3

File hashes

Hashes for xlocllm-1.0.1.tar.gz
Algorithm	Hash digest
SHA256	`a241cb7cd30d5e3e657972fb095f8c3f6b87136ae39c102b6557fce60a0984d7`
MD5	`a8cec52d488d49f0b311a2b4ffd6fd6c`
BLAKE2b-256	`e5b4a7d408410b68179f9120434639977fac3aea4aefa601ad25604133128cea`

File details

Details for the file xlocllm-1.0.1-py3-none-any.whl.

File metadata

Download URL: xlocllm-1.0.1-py3-none-any.whl
Upload date: May 25, 2026
Size: 8.3 MB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.14.3

File hashes

Hashes for xlocllm-1.0.1-py3-none-any.whl
Algorithm	Hash digest
SHA256	`ef26bea934ef929baf51231fd25ff201faee26c1ff5a3dfd9894a5348af59086`
MD5	`5ff3b92579812deba616cfe30de9b218`
BLAKE2b-256	`8649caf4159eb931fd12af2be575eba42ab2ca8455757dd0ffde2199318e7ea0`
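A downloaded file can be checked against the published SHA256 digest before installing it. The sketch below uses only hashlib and hmac from the standard library; sha256_of and matches_published_hash are hypothetical helper names, and the file path passed in is whatever location the archive was saved to.

```python
import hashlib
import hmac


def sha256_of(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read fixed-size chunks until read() returns an empty bytes object.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path: str, expected_hex: str) -> bool:
    """Compare the file's digest with a published hex digest, ignoring case and whitespace."""
    return hmac.compare_digest(sha256_of(path), expected_hex.strip().lower())
```

For the wheel, the expected value is the SHA256 row listed for xlocllm-1.0.1-py3-none-any.whl; a False result means the file differs from what was uploaded and should not be installed.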
