Skip to main content

A vLLM plugin to register the MERaLiON-2-10B model architecture with vLLM’s plugin system.

Project description

MERaLiON2 vLLM Plugin

Licence

MERaLiON-Public-Licence-v2

Set up Environment

This vLLM plugin for MERaLiON2 requires transformers version 4.50.1. It supports vLLM version 0.6.5 ~ 0.7.3 (V0 engine), and 0.8.5 ~ 0.8.5.post1 (V1 engine).

pip install transformers==4.50.1
pip install vllm==0.6.5

Install the MERaLiON2 vLLM plugin.

python install vllm-plugin-meralion2

It's strongly recommended to install flash-attn for better memory and gpu utilization.

pip install flash-attn --no-build-isolation

Offline Inference

Refer to offline_example.py for offline inference example.

OpenAI-compatible Serving

Refer to openai_serve_example.sh for openAI-compatible serving example.

To call the server, you can refer to openai_client_example.py.

Alternatively, you can try calling the server with curl, refer to openai_client_curl.sh.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

vllm_plugin_meralion2-0.1.2.post2.tar.gz (424.0 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

vllm_plugin_meralion2-0.1.2.post2-py3-none-any.whl (18.2 kB view details)

Uploaded Python 3

File details

Details for the file vllm_plugin_meralion2-0.1.2.post2.tar.gz.

File metadata

File hashes

Hashes for vllm_plugin_meralion2-0.1.2.post2.tar.gz
Algorithm Hash digest
SHA256 8fef65207508502b908b282d7ac3c864960a9fbb4c84c4513b264ff707d131cf
MD5 6dce64ea04ff16876b798c376db21527
BLAKE2b-256 12a4f23e3c3b910ace461c5fb81cf57a214ebe1ac2bd420c8ea63b3e5bb10c5e

See more details on using hashes here.

File details

Details for the file vllm_plugin_meralion2-0.1.2.post2-py3-none-any.whl.

File metadata

File hashes

Hashes for vllm_plugin_meralion2-0.1.2.post2-py3-none-any.whl
Algorithm Hash digest
SHA256 58f3933a2441ce82e938f89babe617b53ce4392446cda6a10804962698c83de9
MD5 82768e5a3438601ced2c70c03b083aeb
BLAKE2b-256 ca33fdbcc6aafc470e9570b97988b3837a480d58487f072f7da68a6541b2baf1

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page