Skip to main content

A vLLM plugin to register the MERaLiON-2-10B model architecture with vLLM’s plugin system.

Project description

MERaLiON2 vLLM Plugin

Licence

MERaLiON-Public-Licence-v2

Set up Environment

This vLLM plugin for MERaLiON2 requires transformers version 4.50.1. It supports vLLM version 0.6.5 ~ 0.7.3 (V0 engine), and 0.8.5 ~ 0.8.5.post1 (V1 engine).

pip install transformers==4.50.1
pip install vllm==0.6.5

Install the MERaLiON2 vLLM plugin.

python install vllm-plugin-meralion2

It's strongly recommended to install flash-attn for better memory and gpu utilization.

pip install flash-attn --no-build-isolation

Offline Inference

Refer to offline_example.py for offline inference example.

OpenAI-compatible Serving

Refer to openai_serve_example.sh for openAI-compatible serving example.

To call the server, you can refer to openai_client_example.py.

Alternatively, you can try calling the server with curl, refer to openai_client_curl.sh.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

vllm_plugin_meralion2-0.1.2.post2.dev3.tar.gz (14.7 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

File details

Details for the file vllm_plugin_meralion2-0.1.2.post2.dev3.tar.gz.

File metadata

File hashes

Hashes for vllm_plugin_meralion2-0.1.2.post2.dev3.tar.gz
Algorithm Hash digest
SHA256 2982eca03204bb16c6975de96049dab02ea67b0703019e67446434847bf9bfd3
MD5 aef9b0fa5c5a1235d2693e266e80c8c2
BLAKE2b-256 4a91c819e2cc23b6909047ee6e40383da3262e1270e539a9a0438cad0e4cae4b

See more details on using hashes here.

File details

Details for the file vllm_plugin_meralion2-0.1.2.post2.dev3-py3-none-any.whl.

File metadata

File hashes

Hashes for vllm_plugin_meralion2-0.1.2.post2.dev3-py3-none-any.whl
Algorithm Hash digest
SHA256 ce5c0acecabc1171ee36d430cb8c9b0c587655e49d0e3c48463e59994d680dfd
MD5 d83b9a4e6665a244f2b99b04f5e43e90
BLAKE2b-256 fd2408e559df964f743a7c1f72b158c51be6ee8cb8f5bd65f4697b207b97a96b

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page