Skip to main content

A vLLM plugin to register the MERaLiON-2-10B model architecture with vLLM’s plugin system.

Project description

MERaLiON2 vLLM Plugin

Licence

MERaLiON-Public-Licence-v2

Set up Environment

This vLLM plugin for MERaLiON2 requires transformers version 4.50.1. It supports vLLM version 0.6.5 ~ 0.7.3 (V0 engine), and 0.8.5 ~ 0.8.5.post1 (V1 engine).

pip install transformers==4.50.1
pip install vllm==0.6.5

Install the MERaLiON2 vLLM plugin.

python install vllm-plugin-meralion2

It's strongly recommended to install flash-attn for better memory and gpu utilization.

pip install flash-attn --no-build-isolation

Offline Inference

Refer to offline_example.py for offline inference example.

OpenAI-compatible Serving

Refer to openai_serve_example.sh for openAI-compatible serving example.

To call the server, you can refer to openai_client_example.py.

Alternatively, you can try calling the server with curl, refer to openai_client_curl.sh.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

vllm_plugin_meralion2-0.1.2.post2.dev1.tar.gz (424.0 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

File details

Details for the file vllm_plugin_meralion2-0.1.2.post2.dev1.tar.gz.

File metadata

File hashes

Hashes for vllm_plugin_meralion2-0.1.2.post2.dev1.tar.gz
Algorithm Hash digest
SHA256 084ce7e551413ae45a094ea80a82dda9f3a33db11fb1c54fa04faf54ec20a234
MD5 b6a24f492a883d908ef7d6f2df98b163
BLAKE2b-256 62ea0a5f8310d1a075db2b15fde7747eb2cfe3d80cdfef131aeb5a868a5e2db6

See more details on using hashes here.

File details

Details for the file vllm_plugin_meralion2-0.1.2.post2.dev1-py3-none-any.whl.

File metadata

File hashes

Hashes for vllm_plugin_meralion2-0.1.2.post2.dev1-py3-none-any.whl
Algorithm Hash digest
SHA256 3f96b1191788e219e79caf999c2add1a168dfe37f62390486e71eccb53c45ec0
MD5 5166314d4deac2475041e809822326a6
BLAKE2b-256 49e306e6d1d2afc98de3f8def6c990db0c8fefdc19019056d52e65fdaaaaa975

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page