Local browser-backed LLM and multimodal inference bridge
xlocllm
xlocllm is a Python SDK for local browser-backed AI inference. It exposes an
OpenAI-compatible loopback API from Python while running model weights in a
paired browser window through WebGPU/WebNN with MLC WebLLM and Transformers.js.
The goal is simple:
pip install xlocllm
Then:
import xlocllm
llm = xlocllm.unit("LLM", "Qwen-3.5-0.8b")
runtime = xlocllm.runtime([llm])
runtime.run()
print(runtime.url) # http://127.0.0.1:1146/v1
print(runtime.chat("Say hello", temperature=0))
What It Does
- Starts a local FastAPI bridge on 127.0.0.1.
- Opens a paired browser app window.
- Runs models inside the browser runtime.
- Provides OpenAI-compatible /v1 endpoints for local clients.
- Supports LLMs, embeddings, rerankers, translation, TTS, vision, ASR, and more through a shared catalog.
- Keeps Python-side objects for models, units, runtimes, and bridges.
Install
pip install xlocllm
Optional OpenAI client helper:
pip install "xlocllm[openai]"
Development install from this repository:
python -m pip install -e .\python\xlocllm[dev,openai]
Quick Start
import xlocllm
runtime = xlocllm.runtime(
    [
        xlocllm.unit("LLM", "Qwen-3.5-0.8b"),
        xlocllm.unit("embedding", "multilingual-e5-small"),
    ]
)
runtime.install()
runtime.run()
print(runtime.status())
OpenAI-Compatible Usage
import xlocllm
from openai import OpenAI
runtime = xlocllm.runtime([xlocllm.unit("LLM", "Qwen-3.5-0.8b")])
runtime.run()
client = OpenAI(base_url=runtime.url, api_key="xlocllm")
response = client.chat.completions.create(
    model="Qwen3.5-0.8B-q4f16_1-MLC",
    messages=[{"role": "user", "content": "Say hello"}],
)
print(response.choices[0].message.content)
With the optional helper:
client = runtime.client()
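Clients that cannot depend on the openai package can likely reach the same endpoint over plain HTTP. The sketch below uses only the standard library and assumes the bridge serves the conventional OpenAI route `/chat/completions` under `runtime.url`, which the client usage above implies but this README does not spell out. The helper names `build_chat_request` and `send_chat` are illustrative and not part of xlocllm, and no Authorization header is sent, on the assumption that the loopback bridge does not require one.

```python
import json
import urllib.request


def build_chat_request(base_url, model, prompt):
    """Build a POST request for the OpenAI-style chat completions route.

    Assumption: the route is <base_url>/chat/completions, as in the OpenAI API.
    """
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_chat(request, timeout=60):
    """Send the request and return the text of the first choice."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.load(response)
    return body["choices"][0]["message"]["content"]
```

With a runtime already running, `send_chat(build_chat_request(runtime.url, "Qwen3.5-0.8B-q4f16_1-MLC", "Say hello"))` should return the same text as the OpenAI client call, provided the route assumption holds.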
Core API
model = xlocllm.model("Qwen-3.5-0.8b", unit="LLM")
models = xlocllm.models(unit="LLM", max_vram_mb=1500)
unit = xlocllm.unit("LLM", "Qwen-3.5-0.8b")
runtime = xlocllm.runtime([unit], port=1146)
bridge = xlocllm.Bridge(port=1146)
print(runtime.url)
print(bridge.url)
print(xlocllm.bridges())
print(xlocllm.runtimes())
print(xlocllm.status())
Documentation
- Full English SDK docs: docs.md
- Full Russian SDK docs: docs_ru.md
- English model catalog: models.md
- Russian model catalog: models_ru.md
Model Lookup
Look up a model by its exact modelId, its label, or any of its aliases:
xlocllm.unit("LLM", "Qwen-3.5-0.8b")
xlocllm.unit("LLM", "Qwen3.5-0.8B-q4f16_1-MLC")
xlocllm.unit("embedding", "multilingual-e5-small")
Browse the complete catalog in models.md.
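To make the lookup rule concrete, here is a minimal sketch of how a name could be resolved against a catalog by modelId, label, or alias. It is not xlocllm's implementation: the `CATALOG` structure, the `find_model` helper, and the label strings are hypothetical, and treating "Qwen-3.5-0.8b" as an alias of "Qwen3.5-0.8B-q4f16_1-MLC" is an assumption based on the examples in this README.

```python
# Hypothetical catalog entries; the real catalog lives in models.md.
CATALOG = [
    {
        "unit": "LLM",
        "modelId": "Qwen3.5-0.8B-q4f16_1-MLC",
        "label": "Qwen 3.5 0.8B",        # hypothetical label
        "aliases": ["Qwen-3.5-0.8b"],    # assumed alias
    },
    {
        "unit": "embedding",
        "modelId": "multilingual-e5-small",
        "label": "Multilingual E5 Small",  # hypothetical label
        "aliases": [],
    },
]


def find_model(unit, name, catalog=CATALOG):
    """Return the entry whose modelId, label, or alias equals `name` exactly."""
    for entry in catalog:
        if entry["unit"] != unit:
            continue
        known_names = {entry["modelId"], entry["label"], *entry["aliases"]}
        if name in known_names:
            return entry
    raise KeyError(f"no {unit} model named {name!r}")
```

Under these assumptions, `find_model("LLM", "Qwen-3.5-0.8b")` and `find_model("LLM", "Qwen3.5-0.8B-q4f16_1-MLC")` return the same entry, which is why either spelling works in `xlocllm.unit(...)`.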
Local State
By default, xlocllm stores bridge metadata and browser profiles under:
- Windows: %LOCALAPPDATA%\xlocllm
- Linux/macOS: $XDG_STATE_HOME/xlocllm or ~/.local/state/xlocllm
Environment variables:
- XLOCLLM_HOME - override local state directory.
- XLOCLLM_WEB_URL - use a custom web runtime URL.
- XLOCLLM_LOG_LEVEL - uvicorn log level.
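The documented defaults and the XLOCLLM_HOME override can be read as a small precedence rule. The sketch below shows one way to compute the resulting directory for inspection or cleanup scripts. It is an illustration of the rules stated above, not xlocllm's own code: the function name `resolve_state_dir` is hypothetical, using XLOCLLM_HOME as-is (without appending a subdirectory) is an assumption, and so is the fallback used when LOCALAPPDATA is unset.

```python
from pathlib import PurePosixPath, PureWindowsPath


def resolve_state_dir(env, platform, home):
    """Return the state directory implied by the documented rules.

    env      -- mapping of environment variables (e.g. os.environ)
    platform -- a sys.platform-style string such as "win32" or "linux"
    home     -- the user's home directory as a string
    """
    windows = platform == "win32"
    path_type = PureWindowsPath if windows else PurePosixPath

    # 1. An explicit override wins (assumed to be used verbatim).
    override = env.get("XLOCLLM_HOME")
    if override:
        return path_type(override)

    # 2. Windows: %LOCALAPPDATA%\xlocllm
    if windows:
        local_app_data = env.get("LOCALAPPDATA")
        # Assumed fallback when LOCALAPPDATA is missing.
        base = path_type(local_app_data) if local_app_data else path_type(home) / "AppData" / "Local"
        return base / "xlocllm"

    # 3. Linux/macOS: $XDG_STATE_HOME/xlocllm or ~/.local/state/xlocllm
    xdg_state_home = env.get("XDG_STATE_HOME")
    base = path_type(xdg_state_home) if xdg_state_home else path_type(home) / ".local" / "state"
    return base / "xlocllm"
```

In a real script this could be called as `resolve_state_dir(os.environ, sys.platform, str(pathlib.Path.home()))`; pure path types are used here so the logic can be checked on any operating system.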
Development Checks
python -m pytest python/xlocllm/tests
python -m ruff check python/xlocllm/src python/xlocllm/tests
python -m mypy python/xlocllm/src
Build the Python package:
cd python\xlocllm
python -m build
Notes
The bridge binds to loopback only. The browser window must remain open while browser-backed models are running. Model availability depends on browser support for WebGPU/WebNN and the model's runtime backend.
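Because the bridge binds to loopback only, a client can sanity-check a base URL before sending prompts to it. The following is a minimal sketch using only the standard library; `is_loopback_url` is a hypothetical helper, not an xlocllm function, and treating the hostname "localhost" as loopback without resolving it is a simplifying assumption.

```python
import ipaddress
from urllib.parse import urlsplit


def is_loopback_url(url):
    """Return True if the URL's host is a loopback address or 'localhost'."""
    host = urlsplit(url).hostname
    if host is None:
        return False
    if host == "localhost":
        # Assumption: 'localhost' maps to loopback; no DNS lookup is made.
        return True
    try:
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        # Any other hostname is not a literal IP address; treat it as non-loopback.
        return False
```

For example, `is_loopback_url("http://127.0.0.1:1146/v1")` is true, while a LAN address such as `http://192.168.1.5:1146/v1` is rejected.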
License
MIT