Python bindings for OxiLLaMa — Pure Rust LLM inference engine

oxillama-py

Python bindings for OxiLLaMa — high-performance LLM inference from Python.

Part of the OxiLLaMa workspace — a Pure Rust LLM inference engine.

What It Provides

  • Engine — load a GGUF model and generate text; releases the GIL during inference
  • SpeculativeEngine — draft + target model pair for faster generation
  • LoadedLora — load a LoRA adapter and hot-swap it onto an Engine
  • Full Python type annotations and docstrings
  • Wheels built with maturin
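
Because Engine.generate releases the GIL during inference, other Python threads can keep running while a generation is in progress. The sketch below shows one way to use that. `generate_in_background` is a hypothetical helper, not part of oxillama-py, and it assumes only an object with a `generate(prompt=..., max_new_tokens=...)` method like the one in the Usage section. Whether a single Engine may be called from several threads at once is not stated here, so the sketch queues work on one worker thread.

```python
from concurrent.futures import Future, ThreadPoolExecutor

# One worker thread: generations are queued and run one at a time,
# so the engine is never called concurrently (an assumption made for safety).
_executor = ThreadPoolExecutor(max_workers=1)


def generate_in_background(engine, prompt: str, max_new_tokens: int = 256) -> Future:
    """Hypothetical helper: run engine.generate(...) on the worker thread.

    Returns a Future; call .result() to block until the text is ready.
    """
    return _executor.submit(
        engine.generate, prompt=prompt, max_new_tokens=max_new_tokens
    )
```

A caller would submit with `future = generate_in_background(engine, "Hello")`, continue with other work on the main thread, and collect the text later with `future.result()`.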

Installation

# Build from source with maturin (requires a Rust toolchain)
pip install maturin
maturin develop --release          # in-place development install
# or
maturin build --release            # build a wheel
pip install target/wheels/oxillama_py-*.whl

Usage

import oxillama_py as ox

# Load model
engine = ox.Engine("llama-3.2-3b.Q4_K_M.gguf")

# Basic generation (GIL is released during the Rust inference call)
output = engine.generate(
    prompt="Tell me about the Rust programming language.",
    max_new_tokens=256,
    temperature=0.8,
    top_p=0.95,
)
print(output)

# Speculative decoding: 3-8x faster on large models
draft   = ox.Engine("llama-3.2-1b.Q4_K_M.gguf")
target  = ox.Engine("llama-3.2-8b.Q4_K_M.gguf")
spec    = ox.SpeculativeEngine(draft=draft, target=target, gamma=4)
output  = spec.generate("Once upon a time", max_new_tokens=512)
print(output)

# LoRA adapter
lora   = ox.LoadedLora("my-adapter.gguf")
engine.apply_lora(lora)
output = engine.generate("Write a haiku.", max_new_tokens=64)
engine.remove_lora()
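
The apply/remove pair above can be wrapped so the adapter is always detached, even if generation raises. This is a minimal sketch: `lora_applied` is a hypothetical helper, not part of oxillama-py, and it assumes only the `apply_lora(lora)` and `remove_lora()` methods shown in the example.

```python
from contextlib import contextmanager


@contextmanager
def lora_applied(engine, lora):
    """Hypothetical helper: keep a LoRA adapter applied for a with-block."""
    engine.apply_lora(lora)
    try:
        yield engine
    finally:
        # Runs on normal exit and when the block raises,
        # so the adapter does not stay attached by accident.
        engine.remove_lora()
```

With this helper, the haiku example becomes `with lora_applied(engine, lora) as e: output = e.generate("Write a haiku.", max_new_tokens=64)`.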

License

Apache-2.0 — COOLJAPAN OU (Team Kitasan)

Download files

Source Distribution

  • oxillama-0.1.0.tar.gz (404.5 kB): Source

Built Distributions

  • oxillama-0.1.0-cp38-abi3-win_amd64.whl (1.7 MB): CPython 3.8+, Windows x86-64
  • oxillama-0.1.0-cp38-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (1.9 MB): CPython 3.8+, manylinux (glibc 2.17+), x86-64
  • oxillama-0.1.0-cp38-abi3-manylinux_2_17_aarch64.manylinux2014_aarch64.whl (1.9 MB): CPython 3.8+, manylinux (glibc 2.17+), ARM64
  • oxillama-0.1.0-cp38-abi3-macosx_11_0_arm64.whl (1.7 MB): CPython 3.8+, macOS 11.0+, ARM64
  • oxillama-0.1.0-cp38-abi3-macosx_10_12_x86_64.whl (1.8 MB): CPython 3.8+, macOS 10.12+, x86-64
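
The wheel file names above follow the standard layout `{distribution}-{version}(-{build tag})?-{python tag}-{abi tag}-{platform tag}.whl`. The `cp38-abi3` pair marks a wheel built against CPython's stable ABI, which is why each file is listed as "CPython 3.8+". As an illustration only, the sketch below splits a name into those fields. `parse_wheel_filename` and `WheelTags` here are hypothetical names written for this example. For real tooling, the `packaging` library's `packaging.utils.parse_wheel_filename` is likely the more robust choice.

```python
from typing import NamedTuple


class WheelTags(NamedTuple):
    distribution: str
    version: str
    python_tag: str
    abi_tag: str
    platform_tag: str


def parse_wheel_filename(filename: str) -> WheelTags:
    """Illustrative parser for a wheel file name (not a full validator)."""
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel file name: {filename!r}")
    parts = filename[: -len(".whl")].split("-")
    # 5 fields normally, 6 when an optional build tag follows the version.
    if len(parts) not in (5, 6):
        raise ValueError(f"unexpected number of fields in {filename!r}")
    # The last three fields are always python tag, abi tag, platform tag.
    python_tag, abi_tag, platform_tag = parts[-3:]
    return WheelTags(parts[0], parts[1], python_tag, abi_tag, platform_tag)
```

For `oxillama-0.1.0-cp38-abi3-macosx_11_0_arm64.whl` this yields python tag `cp38`, ABI tag `abi3`, and platform tag `macosx_11_0_arm64`.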

File details

Details for the file oxillama-0.1.0.tar.gz.

File metadata

  • Download URL: oxillama-0.1.0.tar.gz
  • Size: 404.5 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for oxillama-0.1.0.tar.gz
Algorithm Hash digest
SHA256 c63bfe4c37b8636558a36028a6b60989654629b768d7e990e37183dd111797b1
MD5 150af025d1e58b16006a58e064880bc8
BLAKE2b-256 ff535b232342047460115621671d7a12317bb1eb0ecfcf7ad29098ae7cb3a7e1
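
A downloaded file can be checked against the SHA256 digest listed above using only the Python standard library. This is a minimal sketch: the helper names `sha256_of` and `matches_sha256` are made up for this example, and the path passed in is wherever the file was saved locally.

```python
import hashlib
import hmac


def sha256_of(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_sha256(path: str, expected_hex: str) -> bool:
    """Compare a file's digest with the published one (case-insensitive)."""
    return hmac.compare_digest(sha256_of(path), expected_hex.strip().lower())
```

For example, `matches_sha256("oxillama-0.1.0.tar.gz", "<SHA256 value from the table>")` should return True for an intact download of the source distribution.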

Provenance

The following attestation bundles were made for oxillama-0.1.0.tar.gz:

Publisher: pypi-publish.yml on cool-japan/oxillama

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file oxillama-0.1.0-cp38-abi3-win_amd64.whl.

File metadata

  • Download URL: oxillama-0.1.0-cp38-abi3-win_amd64.whl
  • Size: 1.7 MB
  • Tags: CPython 3.8+, Windows x86-64
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for oxillama-0.1.0-cp38-abi3-win_amd64.whl
Algorithm Hash digest
SHA256 f4d7ac3f4163ff600de24b125bb44b231bb4d207ecbc17cb4a8599666a9d970e
MD5 6735dc07339c9d32c4b4dc84707b4cb5
BLAKE2b-256 5c14da02c9c82c47365c7609a4e497d4cc1a8814245a73555fd6a8e661eb8f9d

Provenance

The following attestation bundles were made for oxillama-0.1.0-cp38-abi3-win_amd64.whl:

Publisher: pypi-publish.yml on cool-japan/oxillama

File details

Details for the file oxillama-0.1.0-cp38-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File hashes

Hashes for oxillama-0.1.0-cp38-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 e67805ccca3a79ac067ccdb6a44e841af1eec853b8f1877f4aba0aab4445bbe7
MD5 e810b699cd5b4fd56ad0a6f958e202a1
BLAKE2b-256 3055cb5f026275c27f65a06c6a403869282f75d9339688cd06caa46c6dcc4b61

Provenance

The following attestation bundles were made for oxillama-0.1.0-cp38-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl:

Publisher: pypi-publish.yml on cool-japan/oxillama

File details

Details for the file oxillama-0.1.0-cp38-abi3-manylinux_2_17_aarch64.manylinux2014_aarch64.whl.

File hashes

Hashes for oxillama-0.1.0-cp38-abi3-manylinux_2_17_aarch64.manylinux2014_aarch64.whl
Algorithm Hash digest
SHA256 6481aa3b1c868f292ba0ff1156c9c61d5658b2cc22f7471694b4365ee13d62cc
MD5 cee263ef59ad906ff7d1e1de60770403
BLAKE2b-256 6b839602ee810c4e843a290f8da3ee08551b0092e57be1f36a3d9e1acbd4d40f

Provenance

The following attestation bundles were made for oxillama-0.1.0-cp38-abi3-manylinux_2_17_aarch64.manylinux2014_aarch64.whl:

Publisher: pypi-publish.yml on cool-japan/oxillama

File details

Details for the file oxillama-0.1.0-cp38-abi3-macosx_11_0_arm64.whl.

File hashes

Hashes for oxillama-0.1.0-cp38-abi3-macosx_11_0_arm64.whl
Algorithm Hash digest
SHA256 9eb904c6def83bbb43fc07f4b962ba740db32b9d867e0d0e9122af0f3f46671d
MD5 ae06502d7cd4a7111cb2a39bf3f2ed69
BLAKE2b-256 9951e68c46f13e6bb21f36096772279b2ab325938a7c6e18e5e9322b911bec8e

Provenance

The following attestation bundles were made for oxillama-0.1.0-cp38-abi3-macosx_11_0_arm64.whl:

Publisher: pypi-publish.yml on cool-japan/oxillama

File details

Details for the file oxillama-0.1.0-cp38-abi3-macosx_10_12_x86_64.whl.

File hashes

Hashes for oxillama-0.1.0-cp38-abi3-macosx_10_12_x86_64.whl
Algorithm Hash digest
SHA256 54644424c679042263da4c970cf26c9610f54aba8e9e752450814ecc823063c9
MD5 1c1f0e05b51e56b68d91b28fa358f520
BLAKE2b-256 7df2b5a3091a8bfbd89f6c51000383a591af53208e0d7863b6c07f8cb05ec364

Provenance

The following attestation bundles were made for oxillama-0.1.0-cp38-abi3-macosx_10_12_x86_64.whl:

Publisher: pypi-publish.yml on cool-japan/oxillama
