
Qwerky vLLM Models

A vLLM plugin for serving Qwerky AI's MambaInLlama hybrid models without the --trust-remote-code flag.

Installation

pip install vllm qwerky-vllm-models

Usage

After installing, serve Qwerky models with vLLM:

vllm serve QwerkyAI/Qwerky-Llama3.2-Mamba-3B-Llama3.3-70B-base-distill --max-model-len 4096

The plugin automatically registers the model architecture with vLLM on import.
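Once the server is up, you can query it through vLLM's OpenAI-compatible API. A minimal sketch, assuming the server is running locally on vLLM's default port 8000:

```shell
curl http://localhost:8000/v1/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "QwerkyAI/Qwerky-Llama3.2-Mamba-3B-Llama3.3-70B-base-distill",
    "prompt": "Hybrid Mamba-Transformer models are",
    "max_tokens": 64
  }'
```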

Supported Models

  • QwerkyAI/Qwerky-Llama3.2-Mamba-3B-Llama3.3-70B-base-distill

How It Works

This package uses vLLM's plugin system (vllm.general_plugins entry point) to register the MambaInLlama model architecture. This means:

  • No fork of vLLM required
  • No --trust-remote-code flag needed
  • Works with standard vLLM installation
  • Uses vLLM's native Triton-accelerated Mamba kernels
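To illustrate the mechanism: a vLLM general plugin is a package that declares an entry point in the `vllm.general_plugins` group (e.g. under `[project.entry-points."vllm.general_plugins"]` in pyproject.toml), and vLLM calls that entry point at startup. A minimal sketch of what such a registration function might look like — the architecture string and module path below are hypothetical, not this package's actual internals:

```python
# Sketch of a vllm.general_plugins entry-point function.
# The architecture name and module path are illustrative placeholders.

def register():
    # Import lazily so the plugin only touches vLLM when vLLM invokes it.
    from vllm import ModelRegistry

    # Map the architecture string found in the model's config.json to the
    # class that implements it, so vLLM can load the model without
    # --trust-remote-code.
    ModelRegistry.register_model(
        "MambaInLlamaForCausalLM",
        "qwerky_vllm_models.modeling:MambaInLlamaForCausalLM",
    )
```

Because registration happens through the entry point, no changes to vLLM itself are needed: installing the package is enough for vLLM to discover and register the architecture.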

Requirements

  • Python >= 3.10
  • vLLM >= 0.14.0
  • PyTorch >= 2.0.0

License

Apache 2.0
