README.md

Introduction

This repository provides a template for deploying large language models (LLMs) with Docker Compose. The setup runs one or more models with GPU support, uses Redis for data storage, and uses RabbitMQ for task queuing and processing. The provided main.py script shows how to initialize the worker and start processing tasks with any model that inherits from the base model class (BaseLLM).


Table of Contents

  1. Docker Compose Setup
  2. Main Script Overview
  3. Environment Variable Configuration

Docker Compose Setup

The provided docker-compose.yml template can be used to deploy one or more LLMs. It runs the container with the NVIDIA runtime for GPU execution and attaches it to a shared network (llm_wrap_network).

Example docker-compose.yml

version: '3.8'

services:
   llm:
      container_name: <your_container_name>
      image: <your_image_name>:latest
      runtime: nvidia
      deploy:
         resources:
            limits:
               memory: 100G
      build:
         context: ..
         dockerfile: Dockerfile
      env_file: .env
      environment:
         FORCE_CMAKE: 1
      volumes:
         - <your_path_to_data_in_docker>:/data
      ports:
         - "8677:8672"
      networks:
         - llm_wrap_network
      restart: unless-stopped

networks:
   llm_wrap_network:
      name: llm_wrap_network
      driver: bridge

Adding Multiple Models

You can define multiple models by duplicating the service block in docker-compose.yml and adjusting the relevant parameters (e.g., container name, ports, GPUs). For example:

services:
  llm_1:
    container_name: llm_model_1
    image: llm_image:latest
    runtime: nvidia
    environment:
      MODEL_PATH: /data/model_1
      # GPU index or UUID, as listed by `nvidia-smi -L`
      NVIDIA_VISIBLE_DEVICES: "0"
    ports:
      - "8677:8672"
    # Attach the service to the shared network declared below
    networks:
      - llm_wrap_network

  llm_2:
    container_name: llm_model_2
    image: llm_image:latest
    runtime: nvidia
    environment:
      MODEL_PATH: /data/model_2
      NVIDIA_VISIBLE_DEVICES: "1"
    ports:
      - "8678:8672"
    networks:
      - llm_wrap_network

networks:
  llm_wrap_network:
    name: llm_wrap_network
    driver: bridge

By assigning separate GPUs and ports, you can scale your infrastructure to serve multiple models simultaneously.
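
When the number of models grows, duplicating service blocks by hand becomes error-prone. The following is a minimal sketch, not part of this repository, that generates one service block per model path. The function name build_services, the one-GPU-per-model assignment by index, and the sequential host ports starting at 8677 are assumptions made for illustration.

```python
import json


def build_services(model_paths, image="llm_image:latest",
                   first_host_port=8677, container_port=8672,
                   network="llm_wrap_network"):
    """Return a Compose-style dict with one GPU service per model path."""
    services = {}
    for index, model_path in enumerate(model_paths):
        name = f"llm_{index + 1}"
        services[name] = {
            "container_name": f"llm_model_{index + 1}",
            "image": image,
            "runtime": "nvidia",
            "environment": {
                "MODEL_PATH": model_path,
                # Assumption: one GPU per model, selected by index.
                "NVIDIA_VISIBLE_DEVICES": str(index),
            },
            # Each service gets its own host port; the container port is shared.
            "ports": [f"{first_host_port + index}:{container_port}"],
            "networks": [network],
        }
    return {
        "services": services,
        "networks": {network: {"name": network, "driver": "bridge"}},
    }


if __name__ == "__main__":
    compose = build_services(["/data/model_1", "/data/model_2"])
    print(json.dumps(compose, indent=2))
```

The script prints JSON; since YAML is designed as a superset of JSON, the output should be usable as a Compose file or can be converted to YAML with a tool of your choice.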


Main Script Overview

The provided main.py script demonstrates how to initialize and run the LLM wrapper (LLMWrap) with a selected model. The wrapper uses RabbitMQ for task queuing and Redis for result storage.

Key Components

  1. Model Initialization:

    llm_model = VllMModel(model_path=MODEL_PATH)
    
    • Any model inheriting from BaseLLM can be used here.
    • Replace VllMModel with your custom model class if needed.
  2. LLMWrap Initialization:

    llm_wrap = LLMWrap(
        llm_model=llm_model,
        redis_host=REDIS_HOST,
        redis_port=REDIS_PORT,
        queue_name=QUEUE_NAME,
        rabbit_host=RABBIT_MQ_HOST,
        rabbit_port=RABBIT_MQ_PORT,
        rabbit_login=RABBIT_MQ_LOGIN,
        rabbit_password=RABBIT_MQ_PASSWORD,
        redis_prefix=REDIS_PREFIX
    )
    
    • This connects the model to the task queue and result storage.
    • Ensure the environment variables match the corresponding API configuration.
  3. Starting the Connection:

    llm_wrap.start_connection()
    
    • Begins consuming tasks from RabbitMQ and processes them using the selected model.
    • Results are saved to Redis.
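
The consume, process, and store flow described above can be illustrated without any external services. The sketch below is an in-memory simulation, not the LLMWrap implementation: queue.Queue stands in for RabbitMQ, a dict stands in for Redis, and BaseModelStandIn with its generate method is a hypothetical stand-in for BaseLLM, whose real interface may differ. The task format and the "<prefix>:<task id>" key layout are also assumptions.

```python
import json
import queue
from abc import ABC, abstractmethod


class BaseModelStandIn(ABC):
    """Hypothetical stand-in for BaseLLM; the real interface may differ."""

    @abstractmethod
    def generate(self, prompt: str) -> str:
        ...


class EchoModel(BaseModelStandIn):
    """Toy model that upper-cases the prompt instead of running an LLM."""

    def generate(self, prompt: str) -> str:
        return prompt.upper()


def run_worker(model, tasks, results, prefix):
    """Drain the task queue, run the model, and store each result by key."""
    while True:
        try:
            raw = tasks.get_nowait()
        except queue.Empty:
            break
        task = json.loads(raw)
        answer = model.generate(task["prompt"])
        # Assumption: results are keyed as "<prefix>:<task id>".
        results[f"{prefix}:{task['id']}"] = answer


if __name__ == "__main__":
    tasks = queue.Queue()  # stands in for the RabbitMQ queue
    results = {}           # stands in for Redis
    tasks.put(json.dumps({"id": "42", "prompt": "hello"}))
    run_worker(EchoModel(), tasks, results, prefix="llm")
    print(results)
```

Swapping EchoModel for another subclass changes the processing step without touching the loop, which mirrors how any BaseLLM subclass can be passed to LLMWrap.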

Environment Variable Configuration

Environment variables are defined in the .env file, which Docker Compose loads into the container through the env_file entry. The following variables are required:

TOKENS_LEN=16384
GPU_MEMORY_UTILISATION=0.9
TENSOR_PARALLEL_SIZE=2
MODEL_PATH=/data/<your_model_path>
NVIDIA_VISIBLE_DEVICES=<your_GPUs>
REDIS_HOST=localhost
REDIS_PORT=6379
QUEUE_NAME=<your_queue_name>
RABBIT_MQ_HOST=<rabbitmq_host>
RABBIT_MQ_PORT=<rabbitmq_port>
RABBIT_MQ_LOGIN=<rabbitmq_login>
RABBIT_MQ_PASSWORD=<rabbitmq_password>
REDIS_PREFIX=<redis_key_prefix>
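
A script that consumes these variables benefits from failing early when one is missing or has the wrong type. The following is a minimal sketch of such a check, assuming the variable names listed above; the WorkerSettings dataclass and the load_settings function are hypothetical helpers and do not come from main.py. NVIDIA_VISIBLE_DEVICES is left out because it is read by the NVIDIA container runtime rather than by the Python process.

```python
import os
from dataclasses import dataclass

# Variables the worker script is assumed to read directly.
REQUIRED = (
    "TOKENS_LEN", "GPU_MEMORY_UTILISATION", "TENSOR_PARALLEL_SIZE",
    "MODEL_PATH", "REDIS_HOST", "REDIS_PORT", "QUEUE_NAME",
    "RABBIT_MQ_HOST", "RABBIT_MQ_PORT", "RABBIT_MQ_LOGIN",
    "RABBIT_MQ_PASSWORD", "REDIS_PREFIX",
)


@dataclass(frozen=True)
class WorkerSettings:
    tokens_len: int
    gpu_memory_utilisation: float
    tensor_parallel_size: int
    model_path: str
    redis_host: str
    redis_port: int
    queue_name: str
    rabbit_host: str
    rabbit_port: int
    rabbit_login: str
    rabbit_password: str
    redis_prefix: str


def load_settings(env=None):
    """Validate and convert environment variables into typed settings."""
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED if not env.get(name)]
    if missing:
        raise KeyError(f"Missing environment variables: {', '.join(missing)}")
    return WorkerSettings(
        tokens_len=int(env["TOKENS_LEN"]),
        gpu_memory_utilisation=float(env["GPU_MEMORY_UTILISATION"]),
        tensor_parallel_size=int(env["TENSOR_PARALLEL_SIZE"]),
        model_path=env["MODEL_PATH"],
        redis_host=env["REDIS_HOST"],
        redis_port=int(env["REDIS_PORT"]),
        queue_name=env["QUEUE_NAME"],
        rabbit_host=env["RABBIT_MQ_HOST"],
        rabbit_port=int(env["RABBIT_MQ_PORT"]),
        rabbit_login=env["RABBIT_MQ_LOGIN"],
        rabbit_password=env["RABBIT_MQ_PASSWORD"],
        redis_prefix=env["REDIS_PREFIX"],
    )
```

Calling load_settings() with no argument reads os.environ; passing a dict makes the function easy to exercise in tests.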

Synchronization with API

If you are deploying an API in another container, ensure the following:

  1. Environment Variables:

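    • Match the Redis and RabbitMQ configuration between the API and the LLM containers (e.g., REDIS_HOST, RABBIT_MQ_HOST). Inside a container, localhost refers to the container itself, so when Redis or RabbitMQ runs in a separate container, set the host to that container's service or container name.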
  2. Network:

    • Both containers must be on the same Docker network (llm_wrap_network in this example).
  3. Shared Queues:

    • The QUEUE_NAME variable should be consistent across containers to ensure tasks are properly routed.
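
Mismatched settings between the two containers are easy to miss. As a minimal sketch, the helper below compares two .env files and reports keys whose values differ. The functions parse_env and find_mismatches are hypothetical, and the list of keys treated as shared is an assumption that should be adjusted to your setup.

```python
# Assumption: these keys must be identical in the API and worker .env files.
SHARED_KEYS = (
    "REDIS_HOST", "REDIS_PORT", "REDIS_PREFIX", "QUEUE_NAME",
    "RABBIT_MQ_HOST", "RABBIT_MQ_PORT",
)


def parse_env(text):
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


def find_mismatches(api_env, worker_env, keys=SHARED_KEYS):
    """Return {key: (api value, worker value)} for keys that differ."""
    return {
        key: (api_env.get(key), worker_env.get(key))
        for key in keys
        if api_env.get(key) != worker_env.get(key)
    }


if __name__ == "__main__":
    api = parse_env("REDIS_HOST=redis\nQUEUE_NAME=tasks\n")
    worker = parse_env("REDIS_HOST=redis\nQUEUE_NAME=jobs\n")
    print(find_mismatches(api, worker))
```

An empty result means the compared keys agree; it does not prove that the services are reachable.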

Running the System

  1. Setup Docker Compose:

    • Adjust docker-compose.yml and .env with your specific configuration.
    • Start the system:
      docker-compose up -d
      
  2. Verify Running Containers:

    • Check active containers:
      docker ps
      
  3. Monitor Logs:

    • To view logs for a specific container:
      docker logs -f <container_name>
      
  4. Submit Tasks:

    • Publish tasks to the RabbitMQ queue named by QUEUE_NAME. The worker processes them and saves the results in Redis.
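
Because results are written to Redis asynchronously, a client that submits a task usually has to poll for the answer. The following is a minimal sketch of such a polling helper. The name wait_for_result and the key format used in the example are hypothetical; fetch can be any callable that returns None while the result is missing, for instance the get method of a redis-py client.

```python
import time


def wait_for_result(fetch, key, timeout=30.0, interval=0.5,
                    clock=time.monotonic, sleep=time.sleep):
    """Poll fetch(key) until it returns a non-None value or timeout expires."""
    deadline = clock() + timeout
    while True:
        value = fetch(key)
        if value is not None:
            return value
        if clock() >= deadline:
            raise TimeoutError(f"No result for {key!r} within {timeout} s")
        sleep(interval)


if __name__ == "__main__":
    store = {"llm:42": "HELLO"}  # stands in for Redis
    print(wait_for_result(store.get, "llm:42", timeout=1.0))
```

The clock and sleep parameters exist only so the helper can be tested without real waiting.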

For more details, refer to the comments in docker-compose.yml and main.py. If you encounter any issues, feel free to open an issue in the repository!
