
vLLM plugin for Qwerky AI MambaInLlama hybrid models


Qwerky vLLM Models

A vLLM plugin for serving Qwerky AI's MambaInLlama hybrid models without the --trust-remote-code flag.

Installation

pip install vllm qwerky-vllm-models

Usage

After installing, serve Qwerky models with vLLM:

vllm serve QwerkyAI/Qwerky-Llama3.2-Mamba-3B-Llama3.3-70B-base-distill --max-model-len 4096

The plugin automatically registers the model architecture with vLLM on import.
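Once the server is running, any OpenAI-compatible client can talk to it. Below is a minimal stdlib sketch of a completion request (the prompt text is illustrative; port 8000 is vLLM's default; the actual network call is left commented out so the snippet runs without a live server):

```python
import json
from urllib import request

# vLLM serves an OpenAI-compatible HTTP API, by default on localhost:8000.
payload = {
    "model": "QwerkyAI/Qwerky-Llama3.2-Mamba-3B-Llama3.3-70B-base-distill",
    "prompt": "The Mamba architecture is",
    "max_tokens": 64,
}
req = request.Request(
    "http://localhost:8000/v1/completions",
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)

# Uncomment once `vllm serve` (above) is running:
# with request.urlopen(req) as resp:
#     print(json.load(resp)["choices"][0]["text"])
```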

Supported Models

  • QwerkyAI/Qwerky-Llama3.2-Mamba-3B-Llama3.3-70B-base-distill

How It Works

This package uses vLLM's plugin system (vllm.general_plugins entry point) to register the MambaInLlama model architecture. This means:

  • No fork of vLLM required
  • No --trust-remote-code flag needed
  • Works with standard vLLM installation
  • Uses vLLM's native Triton-accelerated Mamba kernels

Requirements

  • Python >= 3.10
  • vLLM >= 0.14.0
  • PyTorch >= 2.0.0

License

Apache 2.0

Download files


Source Distribution

qwerky_vllm_models-0.2.5.tar.gz (14.0 kB)


Built Distribution


qwerky_vllm_models-0.2.5-py3-none-any.whl (14.9 kB)


File details

Details for the file qwerky_vllm_models-0.2.5.tar.gz.

File metadata

  • Download URL: qwerky_vllm_models-0.2.5.tar.gz
  • Size: 14.0 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.12.3

File hashes

Hashes for qwerky_vllm_models-0.2.5.tar.gz

  • SHA256: 15d1c547f2bdcb77c78013ebdc8c0150e08879a3a751ee88db55a13e6529f3e3
  • MD5: ffa146651253d91ee9162a2294465211
  • BLAKE2b-256: 88ea5c5fda63688f9dde9ca2717b8a50e6a363bb77f2163e7744017b97debff1


File details

Details for the file qwerky_vllm_models-0.2.5-py3-none-any.whl.


File hashes

Hashes for qwerky_vllm_models-0.2.5-py3-none-any.whl

  • SHA256: 231a654a368592e2a845206e9b3b9797bd55414cb705f7b8ff9b32d2295986f3
  • MD5: 0121651f8a6e971ee6e06a6b7c2af6c8
  • BLAKE2b-256: 3660564b433e4c48de787ce8f0fb9186ee2289350dfcd1609f40e5b61cd07a05

