
Qwerky vLLM Models

A vLLM plugin for serving Qwerky AI's MambaInLlama hybrid models without the --trust-remote-code flag.

Installation

pip install vllm qwerky-vllm-models

Usage

After installing, serve Qwerky models with vLLM:

vllm serve QwerkyAI/Qwerky-Llama3.2-Mamba-3B-Llama3.3-70B-base-distill --max-model-len 4096

The plugin automatically registers the model architecture with vLLM on import.
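Once the server is running, it exposes vLLM's standard OpenAI-compatible HTTP API. As a minimal sketch using only the Python standard library (assuming the default port 8000; the helper names here are illustrative, not part of this package), a chat request might look like:

```python
# Sketch: query the OpenAI-compatible endpoint started by `vllm serve`.
import json
from urllib import request

MODEL = "QwerkyAI/Qwerky-Llama3.2-Mamba-3B-Llama3.3-70B-base-distill"

def build_chat_request(prompt: str, max_tokens: int = 128) -> dict:
    """Build a /v1/chat/completions payload for the served model."""
    return {
        "model": MODEL,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }

def query(prompt: str, base_url: str = "http://localhost:8000") -> str:
    """Send a chat completion request and return the model's reply text."""
    payload = json.dumps(build_chat_request(prompt)).encode()
    req = request.Request(
        f"{base_url}/v1/chat/completions",
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    with request.urlopen(req) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]

if __name__ == "__main__":
    print(query("Briefly explain hybrid Mamba/Transformer models."))
```

Any OpenAI-compatible client (for example the official openai Python SDK pointed at http://localhost:8000/v1) works the same way.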

Supported Models

  • QwerkyAI/Qwerky-Llama3.2-Mamba-3B-Llama3.3-70B-base-distill

How It Works

This package uses vLLM's plugin system (the vllm.general_plugins entry point) to register the MambaInLlama model architecture with vLLM at startup. This means:

  • No fork of vLLM required
  • No --trust-remote-code flag needed
  • Works with standard vLLM installation
  • Uses vLLM's native Triton-accelerated Mamba kernels
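As an illustrative sketch of that mechanism (all names below are hypothetical, not this package's actual module layout), a general plugin declares an entry point in its packaging metadata and exposes a function that vLLM imports and calls at startup, which maps an architecture string to a model implementation:

```python
# Hypothetical sketch of a vllm.general_plugins registration function.
#
# The entry point is declared in pyproject.toml, e.g.:
#   [project.entry-points."vllm.general_plugins"]
#   register_qwerky = "qwerky_vllm_models:register"

ARCH_NAME = "MambaInLlamaForCausalLM"  # illustrative architecture string

def register() -> None:
    """Called by vLLM on startup to register the hybrid architecture."""
    # Imported lazily so vLLM is only required when actually serving.
    from vllm import ModelRegistry

    if ARCH_NAME not in ModelRegistry.get_supported_archs():
        # Register by "module:Class" string so the model code is only
        # imported when a checkpoint with this architecture is loaded.
        ModelRegistry.register_model(
            ARCH_NAME, "qwerky_vllm_models.modeling:MambaInLlamaForCausalLM"
        )
```

Because registration happens through standard entry points, simply installing the package is enough; no vLLM fork or --trust-remote-code is involved.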

Requirements

  • Python >= 3.10
  • vLLM >= 0.14.0
  • PyTorch >= 2.0.0

License

Apache 2.0


