A vLLM plugin to register the MERaLiON-2-10B model architecture with vLLM’s plugin system.
Project description
MERaLiON2 vLLM Plugin
Licence
Set up Environment
This plugin family has two release lines:
v0.1.x: compatibility lane for vLLM version0.6.5~0.7.3(V0 engine), and0.8.5~0.8.5.post1(V1 engine).v0.2.x: compatibility lane forvLLM >=0.8.5,<=0.10.0. Refer to matrix_summary.md for detailed vLLM + transformers compatibility.
Install by your vLLM version:
# For vLLM 0.6.5~0.7.3, 0.8.5.
pip install "vllm-plugin-meralion2<0.2"
# For vLLM 0.8.5 ~ 0.10.0
pip install "vllm-plugin-meralion2>=0.2,<0.3"
It's strongly recommended to install flash-attn for better memory and gpu utilization.
pip install flash-attn --no-build-isolation
Offline Inference
Refer to offline_example.py for offline inference example.
OpenAI-compatible Serving
Refer to openai_serve_example.sh for openAI-compatible serving example.
To call the server, you can refer to openai_client_example.py.
Alternatively, you can try calling the server with curl, refer to openai_client_curl.sh.
Full release history
See CHANGELOG.md.
vLLM + transformers compatibility
Security and dependency scanning
The repository uses separate workflows so each scan has a clear purpose:
Security (Bandit SAST)(.github/workflows/security.yml): static security linting of project Python source (bandit -r src).CodeQL(.github/workflows/codeql.yml): semantic code scanning for Python + GitHub Actions security issues.Dependency Audit (pip-audit)(.github/workflows/dependency-audit.yml): installed dependency vulnerability scanning.Dependency Review (PR)(.github/workflows/dependency-review.yml): checks dependency changes in pull requests and fails onmoderate+ severity vulnerabilities.
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
Filter files by name, interpreter, ABI, and platform.
If you're not sure about the file name format, learn more about wheel file names.
Copy a direct link to the current filters
File details
Details for the file vllm_plugin_meralion2-0.2.1.tar.gz.
File metadata
- Download URL: vllm_plugin_meralion2-0.2.1.tar.gz
- Upload date:
- Size: 29.2 kB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.12
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
e2bd6a2e5ae7e7e55189bcfa16d44950b61dfa5b0a365a18f2acf850eb7d58f6
|
|
| MD5 |
038583da786bff25a855986aaa7699d9
|
|
| BLAKE2b-256 |
10c2b02592266fc9e35242392e421407cb32290530f16a4208a6bad0dc8de6dc
|
Provenance
The following attestation bundles were made for vllm_plugin_meralion2-0.2.1.tar.gz:
Publisher:
publish.yml on YingxuH/vllm_plugin
-
Statement:
-
Statement type:
https://in-toto.io/Statement/v1 -
Predicate type:
https://docs.pypi.org/attestations/publish/v1 -
Subject name:
vllm_plugin_meralion2-0.2.1.tar.gz -
Subject digest:
e2bd6a2e5ae7e7e55189bcfa16d44950b61dfa5b0a365a18f2acf850eb7d58f6 - Sigstore transparency entry: 1523461761
- Sigstore integration time:
-
Permalink:
YingxuH/vllm_plugin@0ee27ebd7ea23aaa6312a7ebf5d0f2b1d029e820 -
Branch / Tag:
refs/heads/release/0.2.x - Owner: https://github.com/YingxuH
-
Access:
public
-
Token Issuer:
https://token.actions.githubusercontent.com -
Runner Environment:
github-hosted -
Publication workflow:
publish.yml@0ee27ebd7ea23aaa6312a7ebf5d0f2b1d029e820 -
Trigger Event:
workflow_dispatch
-
Statement type:
File details
Details for the file vllm_plugin_meralion2-0.2.1-py3-none-any.whl.
File metadata
- Download URL: vllm_plugin_meralion2-0.2.1-py3-none-any.whl
- Upload date:
- Size: 17.2 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.12
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
ac5060f2ccb7f266f077a8a943aa3dca738e4239e6e2b4abf9caf0d015e1431e
|
|
| MD5 |
2f40842ab9e8373f9f894415f7752379
|
|
| BLAKE2b-256 |
5ac7a17618edca7e7e4f7182d9c92defc27daf42af46496b0454cf7b3efde3ac
|
Provenance
The following attestation bundles were made for vllm_plugin_meralion2-0.2.1-py3-none-any.whl:
Publisher:
publish.yml on YingxuH/vllm_plugin
-
Statement:
-
Statement type:
https://in-toto.io/Statement/v1 -
Predicate type:
https://docs.pypi.org/attestations/publish/v1 -
Subject name:
vllm_plugin_meralion2-0.2.1-py3-none-any.whl -
Subject digest:
ac5060f2ccb7f266f077a8a943aa3dca738e4239e6e2b4abf9caf0d015e1431e - Sigstore transparency entry: 1523461778
- Sigstore integration time:
-
Permalink:
YingxuH/vllm_plugin@0ee27ebd7ea23aaa6312a7ebf5d0f2b1d029e820 -
Branch / Tag:
refs/heads/release/0.2.x - Owner: https://github.com/YingxuH
-
Access:
public
-
Token Issuer:
https://token.actions.githubusercontent.com -
Runner Environment:
github-hosted -
Publication workflow:
publish.yml@0ee27ebd7ea23aaa6312a7ebf5d0f2b1d029e820 -
Trigger Event:
workflow_dispatch
-
Statement type: