DFloat11 plugin for vLLM

Project description

vLLM x DFloat11


📦 Installation

pip install vllm-df11

Dependencies:

  • vLLM >= 0.9.0
  • CUDA-compatible GPU (A100, H100, H200, RTX, etc.)

You do not need nvcc or a C/C++ compiler to install this package. However, a CUDA-enabled GPU is required to use DFloat11 with vLLM.
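The dependency list above can be checked before loading a model. The following is a minimal sketch, not part of the plugin: the helper names `release_tuple`, `meets_minimum`, and `preflight` are hypothetical, and the sketch assumes that PyTorch is the right place to ask about CUDA visibility (vLLM depends on it). Pre-release suffixes such as `rc1` are ignored in the comparison.

```python
import re
from importlib import metadata

MIN_VLLM = "0.9.0"  # minimum version stated in the dependency list


def release_tuple(version):
    """Return the leading numeric release, e.g. '0.9.1.dev3' -> (0, 9, 1)."""
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError(f"unrecognised version string: {version!r}")
    parts = [int(part) for part in match.group(0).split(".")]
    # Pad to three components so that '0.9' compares equal to '0.9.0'.
    parts += [0] * (3 - len(parts))
    return tuple(parts)


def meets_minimum(installed, minimum=MIN_VLLM):
    """True if the installed release is at least the minimum release."""
    return release_tuple(installed) >= release_tuple(minimum)


def preflight():
    """Collect human-readable problems; an empty list means the checks passed."""
    problems = []
    try:
        installed = metadata.version("vllm")
        if not meets_minimum(installed):
            problems.append(f"vLLM {installed} is older than {MIN_VLLM}")
    except metadata.PackageNotFoundError:
        problems.append("vLLM is not installed")
    try:
        import torch  # imported lazily so the version check works without it

        if not torch.cuda.is_available():
            problems.append("no CUDA-enabled GPU is visible to PyTorch")
    except ImportError:
        problems.append("PyTorch is not installed, so CUDA could not be checked")
    return problems
```

Calling `preflight()` and printing each returned string gives a quick report; it does not confirm that a particular GPU model is supported by the plugin.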


🚀 Usage

Enable the plugin by setting the VLLM_PLUGINS environment variable to "df11" before vLLM's plugins are loaded:

import os

# Select the DFloat11 plugin; set this before the plugins are loaded.
os.environ["VLLM_PLUGINS"] = "df11"

from vllm.plugins import load_general_plugins

# Load the plugins selected above.
load_general_plugins()

from vllm import LLM, SamplingParams

# Placeholder path to a DFloat11-compressed checkpoint.
df11_model_path = "/path/to/dfloat11/e.g./llama3.1-8b-it-df11"

llm = LLM(
    model=df11_model_path,
    load_format="df11",
    dtype="bfloat16",
)

prompts = ["Explain Huffman coding and describe its applications."]

sampling_params = SamplingParams(
    temperature=0.6,
    top_p=0.95,
    max_tokens=64,
)

outputs = llm.generate(prompts, sampling_params=sampling_params)

# Each result carries the original prompt and one or more completions.
for output in outputs:
    prompt = output.prompt
    generated_text = output.outputs[0].text
    print(f"Prompt: {prompt}")
    print(f"Generated: {generated_text}")
    print(" 🌳🦖🗜️📦🌲 " * 5)
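The usage example overwrites VLLM_PLUGINS, which would drop any plugin that was already selected. The sketch below keeps existing entries and adds "df11" once. It assumes VLLM_PLUGINS is a comma-separated list of plugin names, as described in vLLM's environment variable documentation; the helper `with_plugin` is hypothetical and not part of vLLM or this package.

```python
import os

PLUGIN_NAME = "df11"  # plugin name used in the usage example


def with_plugin(current, name=PLUGIN_NAME):
    """Return a comma-separated plugin list that includes `name` exactly once."""
    # Split the existing value, dropping empty entries and stray whitespace.
    names = [item.strip() for item in (current or "").split(",") if item.strip()]
    if name not in names:
        names.append(name)
    return ",".join(names)


# Preserve any plugins that were already selected in the environment.
os.environ["VLLM_PLUGINS"] = with_plugin(os.environ.get("VLLM_PLUGINS"))
```

As in the usage example, this assignment would need to run before `load_general_plugins()` is called.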

📚 Reference

If you use this plugin in your research or deployment, please cite our paper.

Download files

Source Distributions

No source distribution files available for this release.

Built Distributions

vllm_df11-0.0.1-cp313-cp313-manylinux_2_24_x86_64.manylinux_2_28_x86_64.whl (1.2 MB)

Uploaded: CPython 3.13, manylinux: glibc 2.24+ x86-64, manylinux: glibc 2.28+ x86-64

vllm_df11-0.0.1-cp312-cp312-manylinux_2_24_x86_64.manylinux_2_28_x86_64.whl (1.2 MB)

Uploaded: CPython 3.12, manylinux: glibc 2.24+ x86-64, manylinux: glibc 2.28+ x86-64

vllm_df11-0.0.1-cp311-cp311-manylinux_2_24_x86_64.manylinux_2_28_x86_64.whl (1.2 MB)

Uploaded: CPython 3.11, manylinux: glibc 2.24+ x86-64, manylinux: glibc 2.28+ x86-64

vllm_df11-0.0.1-cp39-cp39-manylinux_2_24_x86_64.manylinux_2_28_x86_64.whl (1.2 MB)

Uploaded: CPython 3.9, manylinux: glibc 2.24+ x86-64, manylinux: glibc 2.28+ x86-64
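Only four CPython versions have wheels in this release, and there is no source distribution to fall back on. The sketch below checks whether the running interpreter matches one of the listed tags. The set `AVAILABLE_TAGS` is copied from the file names above; the function names are hypothetical, and the check ignores the platform part of the tag (Linux x86-64 with glibc 2.24 or newer), which pip evaluates on its own during installation.

```python
import sys

# CPython tags of the wheels listed for release 0.0.1.
AVAILABLE_TAGS = {"cp39", "cp311", "cp312", "cp313"}


def cpython_tag(version_info=sys.version_info):
    """Build the interpreter tag, e.g. (3, 12) -> 'cp312'."""
    return f"cp{version_info[0]}{version_info[1]}"


def has_prebuilt_wheel(version_info=sys.version_info):
    """True if a wheel with a matching CPython tag is listed for this release."""
    return cpython_tag(version_info) in AVAILABLE_TAGS


if __name__ == "__main__":
    tag = cpython_tag()
    status = "is" if has_prebuilt_wheel() else "is not"
    print(f"A {tag} wheel {status} listed for vllm-df11 0.0.1")
```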

File details

Details for the file vllm_df11-0.0.1-cp313-cp313-manylinux_2_24_x86_64.manylinux_2_28_x86_64.whl.

File hashes

Hashes for vllm_df11-0.0.1-cp313-cp313-manylinux_2_24_x86_64.manylinux_2_28_x86_64.whl
Algorithm Hash digest
SHA256 e7930d2d26ad49dd4fe3c0852f54f2e8f31df0d49874d5d3cccc72bf26b4b1b3
MD5 bfcc616ef7f52dd68bef1cb3887b6241
BLAKE2b-256 1b85a28fcf9c5182fc461f0f4c1198e68db3542739dd31d8d2e2e06980d3fbb7
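
A downloaded wheel can be compared against the SHA256 digest listed for it. This is a minimal sketch using only the Python standard library; the helper names `sha256_of` and `matches_digest` are hypothetical, and the local file path is whatever location the wheel was saved to.

```python
import hashlib
import hmac
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large wheels are not read into memory at once."""
    digest = hashlib.sha256()
    with Path(path).open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_digest(path, expected_hex):
    """Compare the file's SHA256 with the published hex digest."""
    return hmac.compare_digest(sha256_of(path), expected_hex.strip().lower())
```

For example, `matches_digest(wheel_path, "e7930d2d26ad49dd4fe3c0852f54f2e8f31df0d49874d5d3cccc72bf26b4b1b3")` should return True for an intact copy of the cp313 wheel. pip can also enforce digests itself through its hash-checking mode.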

File details

Details for the file vllm_df11-0.0.1-cp312-cp312-manylinux_2_24_x86_64.manylinux_2_28_x86_64.whl.

File hashes

Hashes for vllm_df11-0.0.1-cp312-cp312-manylinux_2_24_x86_64.manylinux_2_28_x86_64.whl
Algorithm Hash digest
SHA256 9fc7e75ad66218053ec6ccefc1d8a9c40e16985f537b9d60df75caef023c09d9
MD5 1f7c2179ec1bd0f1e54e843d97e58430
BLAKE2b-256 898338cc72daf900c3fd57402e04cc670630a79d0405ea68769e895f13af2923

File details

Details for the file vllm_df11-0.0.1-cp311-cp311-manylinux_2_24_x86_64.manylinux_2_28_x86_64.whl.

File hashes

Hashes for vllm_df11-0.0.1-cp311-cp311-manylinux_2_24_x86_64.manylinux_2_28_x86_64.whl
Algorithm Hash digest
SHA256 5245ee5972d17e24f8fc4c8fa442e2388d85a97d849598a09ab3237084a2f432
MD5 b7796790557729bd21e8b6ec0f902ea9
BLAKE2b-256 032de0b79903d7015a11a6a33c45ba0ab29670b2974ca9f8a86758746cf3c93b

File details

Details for the file vllm_df11-0.0.1-cp39-cp39-manylinux_2_24_x86_64.manylinux_2_28_x86_64.whl.

File hashes

Hashes for vllm_df11-0.0.1-cp39-cp39-manylinux_2_24_x86_64.manylinux_2_28_x86_64.whl
Algorithm Hash digest
SHA256 830e0459d101d218634dcb959f301e5db3b4a6e6e62d1b2ec8a13d35ad9e82ab
MD5 5aaad0dc1c869974f70975b267f1e432
BLAKE2b-256 d8c1e11d483b084d1341eb9460dc75e5a7a1ba70340a8351be13f30a917edd93
