Local browser-backed LLM and multimodal inference bridge

xlocllm

xlocllm is a Python SDK for local browser-backed AI inference. It exposes an OpenAI-compatible loopback API from Python while running model weights in a paired browser window through WebGPU/WebNN with MLC WebLLM and Transformers.js.

The goal is simple:

pip install xlocllm

Then:

import xlocllm

llm = xlocllm.unit("LLM", "Qwen-3.5-0.8b")
runtime = xlocllm.runtime([llm])
runtime.run()

print(runtime.url)  # http://127.0.0.1:1146/v1
print(runtime.chat("Say hello", temperature=0))

What It Does

  • Starts a local FastAPI bridge on 127.0.0.1.
  • Opens a paired browser app window.
  • Runs models inside the browser runtime.
  • Provides OpenAI-compatible /v1 endpoints for local clients.
  • Supports LLMs, embeddings, rerankers, translation, TTS, vision, ASR, and more through a shared catalog.
  • Keeps Python-side objects for models, units, runtimes, and bridges.

Install

pip install xlocllm

Optional OpenAI client helper:

pip install "xlocllm[openai]"

Development install from this repository:

python -m pip install -e "./python/xlocllm[dev,openai]"

Quick Start

import xlocllm

runtime = xlocllm.runtime(
    [
        xlocllm.unit("LLM", "Qwen-3.5-0.8b"),
        xlocllm.unit("embedding", "multilingual-e5-small"),
    ]
)

runtime.install()
runtime.run()

print(runtime.status())

OpenAI-Compatible Usage

import xlocllm
from openai import OpenAI

runtime = xlocllm.runtime([xlocllm.unit("LLM", "Qwen-3.5-0.8b")])
runtime.run()

client = OpenAI(base_url=runtime.url, api_key="xlocllm")
response = client.chat.completions.create(
    model="Qwen3.5-0.8B-q4f16_1-MLC",
    messages=[{"role": "user", "content": "Say hello"}],
)
print(response.choices[0].message.content)

With the optional helper from the xlocllm[openai] extra:

client = runtime.client()
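
Because the endpoints follow the OpenAI wire format, a client without the openai package can in principle talk to the bridge over plain HTTP. The sketch below only builds the request. It assumes the chat route is /chat/completions under runtime.url (the path the OpenAI client derives from base_url) and that the placeholder API key is sent as a bearer token; build_chat_request is a hypothetical helper, not part of xlocllm.

```python
import json
import urllib.request


def build_chat_request(base_url: str, model: str, prompt: str) -> urllib.request.Request:
    """Build (but do not send) a POST request for the chat completions route."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # Assumption: same placeholder key as the OpenAI client example.
            "Authorization": "Bearer xlocllm",
        },
        method="POST",
    )


request = build_chat_request(
    "http://127.0.0.1:1146/v1", "Qwen3.5-0.8B-q4f16_1-MLC", "Say hello"
)
print(request.full_url)

# Sending needs a running runtime with its browser window open:
# with urllib.request.urlopen(request) as response:
#     body = json.load(response)
#     print(body["choices"][0]["message"]["content"])
```

The commented-out lines read the reply the same way the OpenAI client example does, through choices[0].message.content.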

Core API

model = xlocllm.model("Qwen-3.5-0.8b", unit="LLM")
models = xlocllm.models(unit="LLM", max_vram_mb=1500)

unit = xlocllm.unit("LLM", "Qwen-3.5-0.8b")
runtime = xlocllm.runtime([unit], port=1146)
bridge = xlocllm.Bridge(port=1146)

print(runtime.url)
print(bridge.url)
print(xlocllm.bridges())
print(xlocllm.runtimes())
print(xlocllm.status())

Documentation

Model Lookup

Look up a model by its exact modelId, its label, or one of its aliases:

xlocllm.unit("LLM", "Qwen-3.5-0.8b")
xlocllm.unit("LLM", "Qwen3.5-0.8B-q4f16_1-MLC")
xlocllm.unit("embedding", "multilingual-e5-small")

Browse the complete catalog in models.md.
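
To illustrate how a lookup by modelId, label, or alias might resolve, here is a minimal sketch over a two-entry stand-in catalog. Everything in it is an assumption made for illustration: the CatalogEntry shape, the find_entry helper, the label strings, exact (case-sensitive) matching, and the guess that "Qwen-3.5-0.8b" is an alias of "Qwen3.5-0.8B-q4f16_1-MLC". The real catalog and matching rules live in xlocllm and models.md.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CatalogEntry:
    unit: str
    model_id: str
    label: str  # hypothetical display name
    aliases: tuple[str, ...] = ()


# Stand-in catalog; labels and alias assignments are illustrative guesses.
CATALOG = (
    CatalogEntry("LLM", "Qwen3.5-0.8B-q4f16_1-MLC", "Qwen 3.5 0.8B", ("Qwen-3.5-0.8b",)),
    CatalogEntry("embedding", "multilingual-e5-small", "Multilingual E5 Small"),
)


def find_entry(unit: str, name: str) -> CatalogEntry:
    """Return the first entry of the given unit whose modelId, label, or alias equals name."""
    for entry in CATALOG:
        if entry.unit == unit and name in (entry.model_id, entry.label, *entry.aliases):
            return entry
    raise KeyError(f"no {unit} model named {name!r}")


print(find_entry("LLM", "Qwen-3.5-0.8b").model_id)
```

Under these assumptions the short name and the full modelId resolve to the same entry, which would explain why the OpenAI example passes the full modelId as model while the unit is created with the short name.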

Local State

By default, xlocllm stores bridge metadata and browser profiles under:

  • Windows: %LOCALAPPDATA%\xlocllm
  • Linux/macOS: $XDG_STATE_HOME/xlocllm or ~/.local/state/xlocllm

Environment variables:

  • XLOCLLM_HOME - override local state directory.
  • XLOCLLM_WEB_URL - use a custom web runtime URL.
  • XLOCLLM_LOG_LEVEL - set the uvicorn log level.
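
The lookup order described above can be sketched as a small pure function. This is an illustration of the documented rules, not xlocllm's actual code: resolve_state_dir is a hypothetical helper, and it assumes XLOCLLM_HOME names the state directory itself rather than a parent folder.

```python
import os
import sys
from pathlib import Path, PurePath, PurePosixPath, PureWindowsPath
from typing import Mapping


def resolve_state_dir(env: Mapping[str, str], platform: str, home: str) -> PurePath:
    """Hypothetical helper mirroring the documented state-directory rules."""
    windows = platform.startswith("win")
    path_type = PureWindowsPath if windows else PurePosixPath

    # 1. XLOCLLM_HOME overrides the default location (assumed to be the directory itself).
    override = env.get("XLOCLLM_HOME")
    if override:
        return path_type(override)

    if windows:
        # 2. Windows: %LOCALAPPDATA%\xlocllm (raises KeyError if the variable is unset).
        return path_type(env["LOCALAPPDATA"]) / "xlocllm"

    # 3. Linux/macOS: $XDG_STATE_HOME/xlocllm, else ~/.local/state/xlocllm.
    xdg_state = env.get("XDG_STATE_HOME")
    base = path_type(xdg_state) if xdg_state else path_type(home) / ".local" / "state"
    return base / "xlocllm"


# Resolve for the current machine.
print(resolve_state_dir(os.environ, sys.platform, str(Path.home())))
```

Passing the environment, platform, and home directory as arguments keeps the function easy to check against each documented case without touching the real environment.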

Development Checks

python -m pytest python/xlocllm/tests
python -m ruff check python/xlocllm/src python/xlocllm/tests
python -m mypy python/xlocllm/src

Build the Python package:

cd python/xlocllm
python -m build

Notes

The bridge binds to loopback only. The browser window must remain open while browser-backed models are running. Model availability depends on browser support for WebGPU/WebNN and the model's runtime backend.
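
Since the bridge is loopback-only, a client script may want to confirm that the URL it is about to send prompts to really points at the local machine. A minimal sketch using only the standard library follows; is_loopback_url is a hypothetical helper, and it deliberately accepts only literal IP addresses, so host names such as localhost are rejected rather than resolved.

```python
import ipaddress
from urllib.parse import urlsplit


def is_loopback_url(url: str) -> bool:
    """Return True only when the URL's host is a literal loopback IP address."""
    host = urlsplit(url).hostname
    if host is None:
        return False
    try:
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        # Not an IP literal (for example a DNS name); treated as non-loopback here.
        return False


print(is_loopback_url("http://127.0.0.1:1146/v1"))
print(is_loopback_url("http://192.168.1.20:1146/v1"))
```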

License

MIT
