
๐ŸŒŸ The Ultimate Multi-Model LLM Runtime Platform - Deploy, manage, and serve 300+ language models with OpenAI-compatible APIs. Built on ms-swift for production-ready performance.

This project has been archived by its maintainers. No new releases are expected.

Project description

๐ŸŒŸ PolarisLLM - AI Model Orchestration Platform

Deploy and manage 300+ AI models with simple commands

Transform your server into a powerful AI platform. Deploy models in the background, manage them with ease, and access everything through OpenAI-compatible APIs.



โœจ What You Get

๐Ÿš€ Background Model Deployment - Models run automatically in the background
๐ŸŽ›๏ธ Simple Management - Start, stop, and monitor with easy commands
๐Ÿ“Š Real-time Monitoring - See status, memory usage, and live logs
๐Ÿ”Œ OpenAI Compatible - Works with existing OpenAI code
๐ŸŒ 300+ Models - Qwen, Llama, DeepSeek, Mistral, and more


๐Ÿš€ Get Started

Install

pip install polarisllm --upgrade

Start Server

polarisllm start --daemon
๐ŸŒŸ PolarisLLM Runtime Engine
==================================================
๐Ÿš€ Starting PolarisLLM server in daemon mode...
   Host: 0.0.0.0
   Port: 7860
   Log File: /home/user/.polarisllm/logs/server.log

โœ… Server started successfully!
   PID: 12345
   URL: http://0.0.0.0:7860

๐Ÿ’ก Commands:
   polarisllm status              # Check server status
   polarisllm logs --server       # View server logs
   polarisllm stop --server       # Stop server

Deploy Your First Model

polarisllm deploy --model qwen2.5-7b-instruct
๐Ÿš€ Deploying model: qwen2.5-7b-instruct
๐Ÿ“‹ Using convenience shortcut for qwen2.5-7b-instruct
   Model Type: qwen2_5
   Model ID: Qwen/Qwen2.5-7B-Instruct

๐Ÿ“ก Allocated port: 8000
๐Ÿ”ง Command: swift deploy --model_type qwen2_5 --model Qwen/Qwen2.5-7B-Instruct --port 8000 --host 0.0.0.0
๐Ÿ“ Logs: /home/user/.polarisllm/logs/qwen2.5-7b-instruct.log

๐Ÿš€ Starting deployment in background...
โœ… Started process 12346 for qwen2.5-7b-instruct
โœ… Model deployment started successfully!
   Name: qwen2.5-7b-instruct
   PID: 12346
   Port: 8000
   Status: Initializing...

๐Ÿ” Monitor with: polarisllm logs qwen2.5-7b-instruct --follow
๐Ÿ“Š Check status: polarisllm status
๐ŸŒ Access via: http://localhost:7860/v1/chat/completions

Check What's Running

polarisllm list
๐Ÿ“‹ Deployed Models
========================================================================
NAME                    STATUS      PORT    MEMORY   UPTIME    TYPE
qwen2.5-7b-instruct     ๐ŸŸข Running  8000    15.2%    2.5h      qwen2_5

๐Ÿ“Š Summary:
   Total Models: 1
   Running: 1
   Stopped: 0

๐Ÿ’ก Commands:
   polarisllm logs qwen2.5-7b-instruct --follow  # View live logs
   polarisllm stop qwen2.5-7b-instruct           # Stop a model
   polarisllm status                              # Detailed status

๐ŸŽฎ Common Commands

Deploy Models

# Popular models (shortcuts available)
polarisllm deploy --model qwen2.5-7b-instruct
polarisllm deploy --model deepseek-coder-6.7b
polarisllm deploy --model mistral-7b-instruct

# Any model with full name
polarisllm deploy --model my-llama \
  --model-type llama3_1 \
  --model-id meta-llama/Meta-Llama-3.1-8B-Instruct

Sample deployment output:

๐Ÿš€ Deploying model: deepseek-coder-6.7b
๐Ÿ“‹ Using convenience shortcut for deepseek-coder-6.7b
   Model Type: deepseek
   Model ID: deepseek-ai/deepseek-coder-6.7b-instruct

๐Ÿ“ก Allocated port: 8001
๐Ÿš€ Starting deployment in background...
โœ… Model deployed successfully on port 8001!

Manage Your Models

polarisllm list                    # See all models
polarisllm status                  # System overview
polarisllm stop qwen2.5-7b-instruct    # Stop a model
polarisllm undeploy qwen2.5-7b-instruct # Remove completely

Status output:

$ polarisllm status
๐Ÿ“Š PolarisLLM System Status
============================================================
๐Ÿ–ฅ๏ธ  Server Status:
   Status: ๐ŸŸข Running (PID: 12345)
   Memory: 2.1%
   CPU: 0.5%
   API: ๐ŸŸข Healthy
   URL: http://localhost:7860

๐Ÿค– Models Status:
   Total Models: 2
   Running: 2 ๐ŸŸข
   Stopped: 0 ๐Ÿ”ด
   Detailed Status:
     qwen2.5-7b-instruct: ๐ŸŸข running
       Port: 8000, Memory: 15.2%, Uptime: 2.5h
     deepseek-coder-6.7b: ๐ŸŸข running
       Port: 8001, Memory: 12.8%, Uptime: 1.2h

๐Ÿ’พ Resource Status:
   Ports: 2/100 used (98 available)
   Range: 8000-8100
   Total Memory: 28.0% (all models combined)

๐Ÿ’ก Quick Commands:
   polarisllm deploy --model <name>     # Deploy a model
   polarisllm list                      # List all models
   polarisllm logs <model> --follow     # View live logs
   polarisllm stop <model>              # Stop a model
   polarisllm start --daemon            # Start server in background

Watch Logs

polarisllm logs qwen2.5-7b-instruct --follow    # Live logs
polarisllm logs --server --follow               # Server logs

Sample log output:

๐Ÿ“ Logs for model: qwen2.5-7b-instruct
   Lines: 100
   Follow: Yes
============================================================
๐Ÿ”„ Streaming logs (Press Ctrl+C to stop)...

[INFO:swift] Successfully registered model
[INFO:swift] rank: -1, local_rank: -1, world_size: 1
[INFO:swift] Loading the model using model_dir: /cache/Qwen2___5-7B-Instruct
[INFO:swift] Loading model weights...
[INFO:swift] Model loaded successfully
[INFO:swift] Server started on http://0.0.0.0:8000
[INFO:swift] Waiting for requests...

Server Control

polarisllm start --daemon     # Start in background
polarisllm stop --server      # Stop server
polarisllm restart            # Restart everything

๐Ÿค– Available Models

Popular Shortcuts:

  • qwen2.5-7b-instruct - Great all-around chat model
  • qwen2.5-14b-instruct - Larger version for better responses
  • deepseek-coder-6.7b - Excellent for programming
  • deepseek-vl-7b-chat - Understands images and text
  • mistral-7b-instruct - Fast and efficient
  • llama3.1-8b-instruct - Meta's Llama 3.1 instruct model

Categories:

  • Chat: General conversation and Q&A
  • Code: Programming and development help
  • Vision: Image understanding and analysis
  • Audio: Speech and sound processing

See all 300+ models: python -m swift list-models


๐Ÿ”Œ Use with Your Code

Python

import openai

client = openai.OpenAI(
    base_url="http://localhost:7860/v1",
    api_key="not-required"
)

response = client.chat.completions.create(
    model="qwen2.5-7b-instruct",
    messages=[{"role": "user", "content": "Hello!"}]
)
print(response.choices[0].message.content)

Sample response:

{
  "id": "chatcmpl-123",
  "object": "chat.completion",
  "created": 1677652288,
  "model": "qwen2.5-7b-instruct",
  "choices": [{
    "index": 0,
    "message": {
      "role": "assistant",
      "content": "Hello! I'm an AI assistant powered by PolarisLLM. How can I help you today?"
    },
    "finish_reason": "stop"
  }],
  "usage": {
    "prompt_tokens": 9,
    "completion_tokens": 20,
    "total_tokens": 29
  }
}
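Fields worth pulling out of that response object, sketched against the sample payload above:

```python
import json

# The sample chat.completion payload shown above, as a JSON string
payload = """{
  "id": "chatcmpl-123",
  "object": "chat.completion",
  "created": 1677652288,
  "model": "qwen2.5-7b-instruct",
  "choices": [{"index": 0,
               "message": {"role": "assistant", "content": "Hello! How can I help you today?"},
               "finish_reason": "stop"}],
  "usage": {"prompt_tokens": 9, "completion_tokens": 20, "total_tokens": 29}
}"""

resp = json.loads(payload)
reply = resp["choices"][0]["message"]["content"]
finish_reason = resp["choices"][0]["finish_reason"]
total_tokens = resp["usage"]["total_tokens"]

print(reply)
print(finish_reason)  # "stop" means the model ended its turn naturally
print(total_tokens)   # 29
```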

JavaScript

import OpenAI from 'openai';

const client = new OpenAI({
    baseURL: 'http://localhost:7860/v1',
    apiKey: 'not-required'
});

const completion = await client.chat.completions.create({
    model: 'qwen2.5-7b-instruct',
    messages: [{ role: 'user', content: 'Hello!' }]
});

cURL

curl -X POST "http://localhost:7860/v1/chat/completions" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "qwen2.5-7b-instruct",
    "messages": [{"role": "user", "content": "Write a Python function to add two numbers"}]
  }'

Sample cURL response:

{
  "id": "chatcmpl-456",
  "object": "chat.completion",
  "created": 1677652288,
  "model": "qwen2.5-7b-instruct",
  "choices": [{
    "index": 0,
    "message": {
      "role": "assistant",
      "content": "Here's a simple Python function to add two numbers:\n\n```python\ndef add_numbers(a, b):\n    return a + b\n\n# Example usage\nresult = add_numbers(5, 3)\nprint(result)  # Output: 8\n```"
    },
    "finish_reason": "stop"
  }]
}
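Code models usually wrap their answer in a Markdown fence, as in the reply above. A small sketch of extracting and sanity-checking that snippet on the client side (an illustrative helper, not part of polarisllm):

```python
import re

FENCE = "`" * 3  # a literal triple-backtick, built here to avoid nesting fences

# Assistant reply from the sample cURL response above
content = (
    "Here's a simple Python function to add two numbers:\n\n"
    + FENCE + "python\n"
    "def add_numbers(a, b):\n"
    "    return a + b\n"
    + FENCE
)

# Pull the first fenced python block out of the Markdown reply
match = re.search(FENCE + r"python\n(.*?)" + FENCE, content, re.DOTALL)
snippet = match.group(1) if match else ""

# Sanity-check the generated code by executing it in a scratch namespace
ns = {}
exec(snippet, ns)
print(ns["add_numbers"](5, 3))  # 8
```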

๐Ÿ“Š See What's Running

$ polarisllm list
๐Ÿ“‹ Deployed Models
========================================================================
NAME                    STATUS      PORT    MEMORY   UPTIME    TYPE
qwen2.5-7b-instruct     ๐ŸŸข Running  8000    15.2%    2.5h      qwen2_5
deepseek-coder-6.7b     ๐ŸŸข Running  8001    12.8%    1.2h      deepseek
mistral-7b-instruct     ๐Ÿ”ด Stopped  8002    N/A      N/A       mistral
$ polarisllm status
๐Ÿ“Š System Status
===========================================
๐Ÿ–ฅ๏ธ  Server: ๐ŸŸข Running at http://localhost:7860
๐Ÿค– Models: 2 running, 1 stopped
๐Ÿ’พ Resources: 2/100 ports used, 28% memory

๐Ÿšซ Fix Common Issues

Model won't start?

polarisllm logs <model-name>    # Check what went wrong
polarisllm cleanup              # Clean up any stuck processes

Sample error log:

๐Ÿ“ Logs for model: qwen2.5-7b-instruct
============================================================
[ERROR:swift] CUDA out of memory. Tried to allocate 2.0 GiB
[INFO:swift] Try reducing batch size or using a smaller model
[ERROR:swift] Model loading failed

Server not working?

polarisllm status               # See what's happening
polarisllm restart              # Restart everything
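If the CLI says the server is up but clients still fail, probing the HTTP gateway directly can disambiguate. Most OpenAI-compatible servers expose GET /v1/models; assuming PolarisLLM's gateway does too (an assumption, not confirmed above), a minimal probe:

```python
import urllib.request
import urllib.error

def is_healthy(base_url: str, timeout: float = 2.0) -> bool:
    """Return True if the gateway answers the standard OpenAI /v1/models endpoint."""
    try:
        with urllib.request.urlopen(f"{base_url}/v1/models", timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        return False

# True with the daemon running; False if the port is dead or unreachable
print(is_healthy("http://localhost:7860"))
```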

Need to free up space?

polarisllm stop --all           # Stop all models
polarisllm cleanup              # Clean up old processes

Sample cleanup output:

๐Ÿงน Cleaning up PolarisLLM...
========================================
๐Ÿ” Cleaning up dead processes...
๐Ÿ” Cleaning up dead models...
๐Ÿ” Cleaning up port allocations...
Cleaned up 2 dead port allocations
๐Ÿ” Cleaning up old logs...
Cleaned up 3 old log files

โœ… Cleanup completed!
๐Ÿ’ก Use 'polarisllm status' to verify system state

๐Ÿ’ก Pro Tips

  • Models run in background automatically - they survive terminal restarts
  • Use --follow with logs to watch models start up in real-time
  • Each model gets its own port (8000, 8001, 8002...)
  • Server remembers your models even after restarts
  • Use shortcuts for popular models, full names for everything else
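The per-model port behavior above (sequential ports from a 8000-8100 range, lowest free port reused) can be pictured with a toy allocator. This is an illustrative sketch only, not polarisllm's actual implementation:

```python
class PortAllocator:
    """Toy model of sequential port allocation over a fixed range."""

    def __init__(self, start: int = 8000, end: int = 8100):
        self.pool = list(range(start, end + 1))
        self.used = {}  # model name -> port

    def allocate(self, model: str) -> int:
        # Hand out the lowest port not currently assigned to any model
        port = next(p for p in self.pool if p not in self.used.values())
        self.used[model] = port
        return port

    def release(self, model: str) -> None:
        self.used.pop(model, None)

alloc = PortAllocator()
print(alloc.allocate("qwen2.5-7b-instruct"))   # 8000
print(alloc.allocate("deepseek-coder-6.7b"))   # 8001
alloc.release("qwen2.5-7b-instruct")
print(alloc.allocate("mistral-7b-instruct"))   # 8000 (lowest free port reused)
```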

๐Ÿค Need Help?


๐Ÿ“„ License

MIT License - Free to use for any purpose

Project details


Download files

Download the file for your platform.

Source Distribution

polarisllm-2.0.3.tar.gz (45.6 kB)

Uploaded Source

Built Distribution


polarisllm-2.0.3-py3-none-any.whl (49.1 kB)

Uploaded Python 3

File details

Details for the file polarisllm-2.0.3.tar.gz.

File metadata

  • Download URL: polarisllm-2.0.3.tar.gz
  • Upload date:
  • Size: 45.6 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.11.5

File hashes

Hashes for polarisllm-2.0.3.tar.gz:

  • SHA256: 60afa0655a18fc5497afd73e679572e2905575ddf267a3bd52fbb3c612d47be1
  • MD5: 614def36ad499df6c9b818de9a827987
  • BLAKE2b-256: a8a25d1cfe894bda44f82ce4c659ead9d1b22c75fc65854ec5b9ac2686772f01


File details

Details for the file polarisllm-2.0.3-py3-none-any.whl.

File metadata

  • Download URL: polarisllm-2.0.3-py3-none-any.whl
  • Upload date:
  • Size: 49.1 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.11.5

File hashes

Hashes for polarisllm-2.0.3-py3-none-any.whl:

  • SHA256: 4a58fd78b12c98f317db4c36fcec3314bdf027a56708fd8f5d98bc819cd0d4fc
  • MD5: 31021027c5e9eff509984d6dded297ab
  • BLAKE2b-256: 12498b1fbcb9d42f5c07318c21249fdb53011fd99671561af21c99b8090f9db9

