LLMHub Project

Overview

LLMHub is a lightweight management platform designed to streamline running and interacting with various large language models (LLMs). It provides an intuitive command-line interface (CLI) and a RESTful API to manage, start, stop, and interact with LLMs. The platform supports running multiple models with different configurations and context sizes, allowing dynamic scaling and efficient use of resources.

Features

  • Model Management: Start, stop, and monitor multiple LLM processes with different configurations.
  • Dynamic Context Management: Automatically route requests to the most suitable model instance based on the required context size (see the sketch after this list).
  • API Gateway: Provides OpenAI-compatible endpoints for completions and chat completions, making it easy to integrate with existing applications.
  • Modular Design: Extensible architecture with clear separation of concerns, allowing easy modification and expansion.
  • Persistent State Management: Keeps track of running processes, allowing for smooth restarts and state recovery.
  • Logging: Detailed logging for process management and interactions.
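
The context-routing behavior is easiest to see with a concrete picture. The following is an illustrative sketch of the idea only; LLMHub's actual routing lives in llmhub_lib, and the names here are not its real API:

    # Illustrative sketch, not LLMHub's actual API: choose the smallest
    # running instance whose context window can hold the request.
    def pick_instance(instances, required_tokens):
        # instances: (port, context_size) pairs for one model, e.g.
        # [(8081, 512), (8082, 1024), (8083, 2048)]
        candidates = [inst for inst in instances if inst[1] >= required_tokens]
        if not candidates:
            raise ValueError("no instance with a large enough context window")
        return min(candidates, key=lambda inst: inst[1])

Under this scheme, a 700-token request against instances with 512-, 1024-, and 2048-token contexts would land on the 1024-token instance, leaving the largest one free for bigger prompts.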

Installation

  1. Clone the Repository

    git clone https://github.com/jmather/llmhub.git
    cd llmhub
    
  2. Set Up the Virtual Environment

    python3 -m venv .venv
    source .venv/bin/activate
    
  3. Install Dependencies

    pip install -r requirements.txt
    
  4. Configure LLMHub

    • Edit the config.yaml file to define your models, engines, and other settings. Multiple models can be configured, each with its own quantization and context sizes.

    Example:

    on_start:
      MythoMax-L2-13B:
        quant: Q5_K_M
        engine: llamacppserver
        context_size: [512, 1024, 2048]
    
    port: 8080
    enable_proxy: true
    engine_port_min: 8081
    engine_port_max: 10000
    
    engines:
      llamacppserver:
        path: /path/to/llama-server
        arguments: --color -t 20 --parallel 2 --mlock --metrics --verbose
        model_flag: "-m"
        context_size_flag: "-c"
        port_flag: "--port"
        file_types: [gguf]
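
Because the configuration is plain YAML, it can be sanity-checked before anything is started. A minimal sketch using PyYAML (this script is not part of LLMHub; the keys match the example above):

    import yaml  # PyYAML

    with open("config.yaml") as f:
        config = yaml.safe_load(f)

    # List every (model, context size) pair declared under on_start.
    for model, settings in config.get("on_start", {}).items():
        for ctx in settings["context_size"]:
            print(f"{model} ({settings['quant']}) via {settings['engine']}, context {ctx}")

With the example above this prints three lines, one per context size, which presumably corresponds to the engine instances LLMHub manages for that model.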
    

Usage

Command-Line Interface (CLI)

LLMHub provides a set of commands to manage models and interact with them:

  1. Start a Model

    python llmhub.py start MythoMax-L2-13B
    
  2. Stop a Model

    python llmhub.py stop MythoMax-L2-13B
    
  3. List Running Models

    python llmhub.py list_models
    
  4. Update Processes

    python llmhub.py update
    
  5. View Logs

    python llmhub.py logs MythoMax-L2-13B
    

REST API

LLMHub exposes a RESTful API with endpoints compatible with OpenAI's API, allowing seamless integration into existing applications.

  • List Models

    GET /v1/models
    
  • Create a Completion

    POST /v1/completions
    

    Example Payload:

    {
        "model": "MythoMax-L2-13B",
        "prompt": "Once upon a time,",
        "max_tokens": 100,
        "temperature": 0.7
    }
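
Assuming the gateway is running locally on the port from config.yaml (8080 in the example above), the same payload can be sent with Python's requests library:

    import requests

    payload = {
        "model": "MythoMax-L2-13B",
        "prompt": "Once upon a time,",
        "max_tokens": 100,
        "temperature": 0.7,
    }
    # OpenAI-compatible completions endpoint exposed by the LLMHub gateway.
    response = requests.post("http://localhost:8080/v1/completions", json=payload)
    print(response.json())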
    
  • Create a Chat Completion

    POST /v1/chat/completions
    

Development

Directory Structure

  • llmhub_lib/: Contains the core libraries for configuration management, state management, process management, and model management.
  • cli/: Contains CLI commands that interact with the core libraries.
  • web_server.py: Flask-based web server that provides the REST API.
  • config.yaml: Configuration file for defining models, engines, and other settings.

Extending LLMHub

LLMHub's modular design allows easy extension. You can add new engines, modify process management logic, or integrate additional logging or monitoring tools. The AppDependencyContainer makes it easy to inject dependencies and add new components.
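
For the common case of adding a new engine, the engines block in config.yaml already describes how a server binary is launched, so a new entry may be all that is needed, provided the engine's CLI matches these generic flags (the path and arguments below are placeholders, not a real engine):

    engines:
      another_server:
        path: /path/to/another-llm-server
        arguments: --threads 8
        model_flag: "-m"
        context_size_flag: "-c"
        port_flag: "--port"
        file_types: [gguf]

Deeper extensions, such as new CLI commands or monitoring hooks, would be wired up through the AppDependencyContainer; its exact API is best read from the code under llmhub_lib/.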

Testing

You can use tools like Postman to test the API. Import the provided Postman configuration to get started quickly.

Running Tests

Ensure that your environment is set up correctly with all dependencies installed. Use the CLI to start the necessary processes, then run tests against the API endpoints using Postman or cURL.
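
If you prefer a scripted check to Postman, a minimal smoke test only needs the two documented endpoints (assuming a local gateway on port 8080 and the example model from config.yaml):

    import requests

    BASE = "http://localhost:8080"

    # The gateway should list every configured model.
    print(requests.get(f"{BASE}/v1/models").json())

    # A short completion should round-trip through a running engine.
    resp = requests.post(
        f"{BASE}/v1/completions",
        json={"model": "MythoMax-L2-13B", "prompt": "ping", "max_tokens": 5},
    )
    assert resp.status_code == 200, resp.text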

Troubleshooting

  • Model Not Found: Ensure that the model is defined correctly in config.yaml and that the required files are in place.
  • Port Conflicts: Adjust the engine_port_min and engine_port_max settings in config.yaml to avoid conflicts.
  • Process Failures: Check the logs for detailed error messages. Logs are stored in ~/.llmhub/logs.

Contributions

Contributions are welcome! Feel free to submit issues, feature requests, or pull requests to improve LLMHub.

License

This project is licensed under the MIT License. See the LICENSE file for details.


This README provides an overview and instructions for getting started with LLMHub. For more detailed documentation, refer to the code comments and configuration files.

Download files

Download the file for your platform.

Source Distribution

llmhub_cli-0.1.0.tar.gz (15.3 kB)

Built Distribution

llmhub_cli-0.1.0-py3-none-any.whl (20.1 kB)

File details

Details for the file llmhub_cli-0.1.0.tar.gz.

File metadata

  • Download URL: llmhub_cli-0.1.0.tar.gz
  • Size: 15.3 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.1.1 CPython/3.10.11

File hashes

Hashes for llmhub_cli-0.1.0.tar.gz
  • SHA256: 2f14c134d5788cc333ef647bec7efaf989b1bdc3ff35b9082f959dacff3df0b5
  • MD5: 231edabc24697c5faf8536a913985511
  • BLAKE2b-256: e4b6661de2e61b1436543c46537cadfb0973c6a84db11211ec4aec6973834feb


File details

Details for the file llmhub_cli-0.1.0-py3-none-any.whl.

File metadata

  • Download URL: llmhub_cli-0.1.0-py3-none-any.whl
  • Size: 20.1 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.1.1 CPython/3.10.11

File hashes

Hashes for llmhub_cli-0.1.0-py3-none-any.whl
  • SHA256: aade1eb4c0b6a2ec0e3cfbcb2d0637d5beddfa822ef9c30dd1afb0a075536da5
  • MD5: 5deb2d528ac0c23bbc4b807d443d3b1d
  • BLAKE2b-256: 24796ca263fa6a09f0789c7ccb5212c0a21fdf7435ce268c34bbc3410f6b0232

