DFloat11 plugin for vLLM
vLLM x DFloat11
📦 Installation
pip install vllm-df11
Dependencies:
- vLLM >= 0.9.0
- CUDA-compatible GPU (A100, H100, H200, RTX, etc.)
You do not need nvcc or a C/C++ compiler to install this package.
However, a CUDA-enabled GPU is required to use DFloat11 with vLLM.
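The requirements above can be checked from Python before loading a model. The following is a minimal sketch, not part of the plugin: `parse_version` and `check_requirements` are hypothetical helper names, the version parsing is deliberately simplistic (it reads only the leading numeric components), and it assumes PyTorch is installed alongside vLLM so that CUDA visibility can be queried through `torch.cuda.is_available()`.

```python
import importlib.util
from importlib import metadata

MIN_VLLM = (0, 9, 0)  # minimum vLLM version listed in the dependencies


def parse_version(text):
    """Turn '0.9.1' or '0.10.0.dev1+cu124' into a tuple of its leading integers."""
    parts = []
    for piece in text.split("+")[0].split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break  # stop at the first non-numeric component, e.g. 'dev1'
        parts.append(int(digits))
    while len(parts) < 3:
        parts.append(0)  # pad so that '0.9' compares equal to '0.9.0'
    return tuple(parts)


def check_requirements():
    """Report whether vLLM is new enough and whether a CUDA GPU is visible."""
    report = {"vllm_ok": False, "cuda_ok": False}
    try:
        report["vllm_ok"] = parse_version(metadata.version("vllm")) >= MIN_VLLM
    except metadata.PackageNotFoundError:
        pass  # vLLM is not installed in this environment
    if importlib.util.find_spec("torch") is not None:
        import torch  # imported lazily so the check also runs without PyTorch

        report["cuda_ok"] = torch.cuda.is_available()
    return report
```

Calling `check_requirements()` returns a dictionary with two booleans, `vllm_ok` and `cuda_ok`. Both should be true before you try the usage example.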
🚀 Usage
Enable the plugin by setting the VLLM_PLUGINS environment variable to df11 before loading vLLM's general plugins:
import os

# Select the DFloat11 plugin before vLLM loads its plugins.
os.environ["VLLM_PLUGINS"] = "df11"

from vllm.plugins import load_general_plugins
load_general_plugins()

from vllm import LLM, SamplingParams

# Path to a DFloat11-compressed checkpoint.
df11_model_path = "/path/to/dfloat11/e.g./llama3.1-8b-it-df11"
llm = LLM(
    model=df11_model_path,
    load_format="df11",
    dtype="bfloat16",
)

prompts = ["Explain Huffman coding and describe its applications."]
sampling_params = SamplingParams(
    temperature=0.6,
    top_p=0.95,
    max_tokens=64,
)

outputs = llm.generate(prompts, sampling_params=sampling_params)
for output in outputs:
    prompt = output.prompt
    generated_text = output.outputs[0].text
    print(f"Prompt: {prompt}")
    print(f"Generated: {generated_text}")
    print(" 🌳🦖🗜️📦🌲 " * 5)
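Assigning VLLM_PLUGINS directly, as in the example above, overwrites any plugin list already present in the environment. vLLM reads this variable as a comma-separated list of plugin names, so a small helper can add df11 while keeping existing entries. The sketch below is illustrative only: `enable_plugin` is a hypothetical helper, not part of vLLM or this package.

```python
import os


def enable_plugin(name, env=None):
    """Add `name` to the comma-separated VLLM_PLUGINS list, keeping existing entries."""
    env = os.environ if env is None else env
    # Split the current value, dropping empty items and surrounding whitespace.
    current = [p.strip() for p in env.get("VLLM_PLUGINS", "").split(",") if p.strip()]
    if name not in current:
        current.append(name)
    env["VLLM_PLUGINS"] = ",".join(current)
    return env["VLLM_PLUGINS"]
```

Call `enable_plugin("df11")` before `load_general_plugins()` so that the updated list is in place when vLLM discovers its plugins.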
📚 Reference
If you use this plugin in your research or deployment, please cite our paper.
Built Distributions
File details
Details for the file vllm_df11-0.0.1-cp313-cp313-manylinux_2_24_x86_64.manylinux_2_28_x86_64.whl.
File metadata
- Download URL: vllm_df11-0.0.1-cp313-cp313-manylinux_2_24_x86_64.manylinux_2_28_x86_64.whl
- Upload date:
- Size: 1.2 MB
- Tags: CPython 3.13, manylinux: glibc 2.24+ x86-64, manylinux: glibc 2.28+ x86-64
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.12.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | e7930d2d26ad49dd4fe3c0852f54f2e8f31df0d49874d5d3cccc72bf26b4b1b3 |
| MD5 | bfcc616ef7f52dd68bef1cb3887b6241 |
| BLAKE2b-256 | 1b85a28fcf9c5182fc461f0f4c1198e68db3542739dd31d8d2e2e06980d3fbb7 |
File details
Details for the file vllm_df11-0.0.1-cp312-cp312-manylinux_2_24_x86_64.manylinux_2_28_x86_64.whl.
File metadata
- Download URL: vllm_df11-0.0.1-cp312-cp312-manylinux_2_24_x86_64.manylinux_2_28_x86_64.whl
- Upload date:
- Size: 1.2 MB
- Tags: CPython 3.12, manylinux: glibc 2.24+ x86-64, manylinux: glibc 2.28+ x86-64
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.12.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 9fc7e75ad66218053ec6ccefc1d8a9c40e16985f537b9d60df75caef023c09d9 |
| MD5 | 1f7c2179ec1bd0f1e54e843d97e58430 |
| BLAKE2b-256 | 898338cc72daf900c3fd57402e04cc670630a79d0405ea68769e895f13af2923 |
File details
Details for the file vllm_df11-0.0.1-cp311-cp311-manylinux_2_24_x86_64.manylinux_2_28_x86_64.whl.
File metadata
- Download URL: vllm_df11-0.0.1-cp311-cp311-manylinux_2_24_x86_64.manylinux_2_28_x86_64.whl
- Upload date:
- Size: 1.2 MB
- Tags: CPython 3.11, manylinux: glibc 2.24+ x86-64, manylinux: glibc 2.28+ x86-64
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.12.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 5245ee5972d17e24f8fc4c8fa442e2388d85a97d849598a09ab3237084a2f432 |
| MD5 | b7796790557729bd21e8b6ec0f902ea9 |
| BLAKE2b-256 | 032de0b79903d7015a11a6a33c45ba0ab29670b2974ca9f8a86758746cf3c93b |
File details
Details for the file vllm_df11-0.0.1-cp39-cp39-manylinux_2_24_x86_64.manylinux_2_28_x86_64.whl.
File metadata
- Download URL: vllm_df11-0.0.1-cp39-cp39-manylinux_2_24_x86_64.manylinux_2_28_x86_64.whl
- Upload date:
- Size: 1.2 MB
- Tags: CPython 3.9, manylinux: glibc 2.24+ x86-64, manylinux: glibc 2.28+ x86-64
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.12.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 830e0459d101d218634dcb959f301e5db3b4a6e6e62d1b2ec8a13d35ad9e82ab |
| MD5 | 5aaad0dc1c869974f70975b267f1e432 |
| BLAKE2b-256 | d8c1e11d483b084d1341eb9460dc75e5a7a1ba70340a8351be13f30a917edd93 |