
Qwerky vLLM Models

A vLLM plugin for serving Qwerky AI's MambaInLlama hybrid models without the --trust-remote-code flag.

Installation

pip install vllm qwerky-vllm-models

Usage

After installing, serve Qwerky models with vLLM:

vllm serve QwerkyAI/Qwerky-Llama3.2-Mamba-3B-Llama3.3-70B-base-distill --max-model-len 4096

The plugin automatically registers the model architecture with vLLM on import.
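
Once the server from the command above is running, it exposes vLLM's standard OpenAI-compatible API. A minimal stdlib-only sketch of a chat request follows; the localhost port (vLLM's default, 8000) and the prompt are illustrative assumptions, not part of this package:

```python
import json
from urllib import request

# Build a chat-completions request for the locally served Qwerky model.
payload = {
    "model": "QwerkyAI/Qwerky-Llama3.2-Mamba-3B-Llama3.3-70B-base-distill",
    "messages": [{"role": "user", "content": "Hello"}],
    "max_tokens": 64,
}
body = json.dumps(payload).encode()
req = request.Request(
    "http://localhost:8000/v1/chat/completions",
    data=body,
    headers={"Content-Type": "application/json"},
)
# With the server up, request.urlopen(req) returns the JSON completion.
```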

Supported Models

  • QwerkyAI/Qwerky-Llama3.2-Mamba-3B-Llama3.3-70B-base-distill

How It Works

This package registers the MambaInLlama model architecture through vLLM's plugin system (the vllm.general_plugins entry-point group), which means:

  • No fork of vLLM required
  • No --trust-remote-code flag needed
  • Works with standard vLLM installation
  • Uses vLLM's native Triton-accelerated Mamba kernels

Requirements

  • Python >= 3.10
  • vLLM >= 0.14.0
  • PyTorch >= 2.0.0

License

Apache 2.0
