
🌟 The Ultimate Multi-Model LLM Runtime Platform - Deploy, manage, and serve 300+ language models with OpenAI-compatible APIs. Built on ms-swift for production-ready performance.

This project has been archived by its maintainers. No new releases are expected.

Project description

🌟 PolarisLLM - AI Model Orchestration Platform

Deploy and manage 300+ AI models with simple commands

Transform your server into a powerful AI platform. Deploy models in the background, manage them with ease, and access everything through OpenAI-compatible APIs.



✨ What You Get

🚀 Background Model Deployment - Models run automatically in the background
🎛️ Simple Management - Start, stop, and monitor with easy commands
📊 Real-time Monitoring - See status, memory usage, and live logs
🔌 OpenAI Compatible - Works with existing OpenAI code
🌐 300+ Models - Qwen, Llama, DeepSeek, Mistral, and more


🚀 Get Started

Install

pip install polarisllm --upgrade

Start Server

polarisllm start --daemon
🌟 PolarisLLM Runtime Engine
==================================================
🚀 Starting PolarisLLM server in daemon mode...
   Host: 0.0.0.0
   Port: 7860
   Log File: /home/user/.polarisllm/logs/server.log

✅ Server started successfully!
   PID: 12345
   URL: http://0.0.0.0:7860

💡 Commands:
   polarisllm status              # Check server status
   polarisllm logs --server       # View server logs
   polarisllm stop --server       # Stop server

Deploy Your First Model

polarisllm deploy --model qwen2.5-7b-instruct
🚀 Deploying model: qwen2.5-7b-instruct
📋 Using convenience shortcut for qwen2.5-7b-instruct
   Model Type: qwen2_5
   Model ID: Qwen/Qwen2.5-7B-Instruct

📡 Allocated port: 8000
🔧 Command: swift deploy --model_type qwen2_5 --model Qwen/Qwen2.5-7B-Instruct --port 8000 --host 0.0.0.0
📝 Logs: /home/user/.polarisllm/logs/qwen2.5-7b-instruct.log

🚀 Starting deployment in background...
✅ Started process 12346 for qwen2.5-7b-instruct
✅ Model deployment started successfully!
   Name: qwen2.5-7b-instruct
   PID: 12346
   Port: 8000
   Status: Initializing...

🔍 Monitor with: polarisllm logs qwen2.5-7b-instruct --follow
📊 Check status: polarisllm status
🌐 Access via: http://localhost:7860/v1/chat/completions

Check What's Running

polarisllm list
📋 Deployed Models
========================================================================
NAME                    STATUS      PORT    MEMORY   UPTIME    TYPE
qwen2.5-7b-instruct     🟢 Running  8000    15.2%    2.5h      qwen2_5

📊 Summary:
   Total Models: 1
   Running: 1
   Stopped: 0

💡 Commands:
   polarisllm logs qwen2.5-7b-instruct --follow  # View live logs
   polarisllm stop qwen2.5-7b-instruct           # Stop a model
   polarisllm status                              # Detailed status

🎮 Common Commands

Deploy Models

# Popular models (shortcuts available)
polarisllm deploy --model qwen2.5-7b-instruct
polarisllm deploy --model deepseek-coder-6.7b
polarisllm deploy --model mistral-7b-instruct

# Any model with full name
polarisllm deploy --model my-llama \
  --model-type llama3_1 \
  --model-id meta-llama/Meta-Llama-3.1-8B-Instruct

Sample deployment output:

🚀 Deploying model: deepseek-coder-6.7b
📋 Using convenience shortcut for deepseek-coder-6.7b
   Model Type: deepseek
   Model ID: deepseek-ai/deepseek-coder-6.7b-instruct

📡 Allocated port: 8001
🚀 Starting deployment in background...
✅ Model deployed successfully on port 8001!

Manage Your Models

polarisllm list                    # See all models
polarisllm status                  # System overview
polarisllm stop qwen2.5-7b-instruct    # Stop a model
polarisllm undeploy qwen2.5-7b-instruct # Remove completely

Status output:

$ polarisllm status
📊 PolarisLLM System Status
============================================================
🖥️  Server Status:
   Status: 🟢 Running (PID: 12345)
   Memory: 2.1%
   CPU: 0.5%
   API: 🟢 Healthy
   URL: http://localhost:7860

🤖 Models Status:
   Total Models: 2
   Running: 2 🟢
   Stopped: 0 🔴
   Detailed Status:
     qwen2.5-7b-instruct: 🟢 running
       Port: 8000, Memory: 15.2%, Uptime: 2.5h
     deepseek-coder-6.7b: 🟢 running
       Port: 8001, Memory: 12.8%, Uptime: 1.2h

💾 Resource Status:
   Ports: 2/100 used (98 available)
   Range: 8000-8100
   Total Memory: 28.0% (all models combined)

💡 Quick Commands:
   polarisllm deploy --model <name>     # Deploy a model
   polarisllm list                      # List all models
   polarisllm logs <model> --follow     # View live logs
   polarisllm stop <model>              # Stop a model
   polarisllm start --daemon            # Start server in background

Watch Logs

polarisllm logs qwen2.5-7b-instruct --follow    # Live logs
polarisllm logs --server --follow               # Server logs

Sample log output:

📝 Logs for model: qwen2.5-7b-instruct
   Lines: 100
   Follow: Yes
============================================================
🔄 Streaming logs (Press Ctrl+C to stop)...

[INFO:swift] Successfully registered model
[INFO:swift] rank: -1, local_rank: -1, world_size: 1
[INFO:swift] Loading the model using model_dir: /cache/Qwen2___5-7B-Instruct
[INFO:swift] Loading model weights...
[INFO:swift] Model loaded successfully
[INFO:swift] Server started on http://0.0.0.0:8000
[INFO:swift] Waiting for requests...

Server Control

polarisllm start --daemon     # Start in background
polarisllm stop --server      # Stop server
polarisllm restart           # Restart everything

🤖 Available Models

Popular Shortcuts:

  • qwen2.5-7b-instruct - Great all-around chat model
  • qwen2.5-14b-instruct - Larger version for better responses
  • deepseek-coder-6.7b - Excellent for programming
  • deepseek-vl-7b-chat - Understands images and text
  • mistral-7b-instruct - Fast and efficient
  • llama3.1-8b-instruct - Meta's latest model

Categories:

  • Chat: General conversation and Q&A
  • Code: Programming and development help
  • Vision: Image understanding and analysis
  • Audio: Speech and sound processing

See all 300+ models: python -m swift list-models
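
To check programmatically which models the running server exposes, you can query it with the OpenAI Python SDK, as in the minimal sketch below. It assumes the PolarisLLM gateway serves the standard OpenAI-compatible /v1/models endpoint alongside /v1/chat/completions; if it does not, use polarisllm list instead.

import openai

# Assumption: the gateway exposes the standard OpenAI-compatible /v1/models route.
client = openai.OpenAI(
    base_url="http://localhost:7860/v1",
    api_key="not-required"
)

for model in client.models.list():
    print(model.id)  # e.g. "qwen2.5-7b-instruct"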


🔌 Use with Your Code

Python

import openai

client = openai.OpenAI(
    base_url="http://localhost:7860/v1",
    api_key="not-required"
)

response = client.chat.completions.create(
    model="qwen2.5-7b-instruct",
    messages=[{"role": "user", "content": "Hello!"}]
)
print(response.choices[0].message.content)

Sample response:

{
  "id": "chatcmpl-123",
  "object": "chat.completion",
  "created": 1677652288,
  "model": "qwen2.5-7b-instruct",
  "choices": [{
    "index": 0,
    "message": {
      "role": "assistant",
      "content": "Hello! I'm an AI assistant powered by PolarisLLM. How can I help you today?"
    },
    "finish_reason": "stop"
  }],
  "usage": {
    "prompt_tokens": 9,
    "completion_tokens": 20,
    "total_tokens": 29
  }
}
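
For longer generations you can stream tokens with the same client. This is a minimal sketch, assuming the deployed model's backend supports the standard OpenAI streaming protocol (stream=True), as most OpenAI-compatible servers do:

import openai

client = openai.OpenAI(
    base_url="http://localhost:7860/v1",
    api_key="not-required"
)

# Assumption: the backend honors stream=True and sends incremental deltas.
stream = client.chat.completions.create(
    model="qwen2.5-7b-instruct",
    messages=[{"role": "user", "content": "Write a haiku about the night sky."}],
    stream=True,
)

for chunk in stream:
    # Some chunks may carry no content (e.g. the final chunk), so guard before printing.
    if chunk.choices and chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)
print()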

JavaScript

import OpenAI from 'openai';

const client = new OpenAI({
    baseURL: 'http://localhost:7860/v1',
    apiKey: 'not-required'
});

const completion = await client.chat.completions.create({
    model: 'qwen2.5-7b-instruct',
    messages: [{ role: 'user', content: 'Hello!' }]
});

console.log(completion.choices[0].message.content);

cURL

curl -X POST "http://localhost:7860/v1/chat/completions" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "qwen2.5-7b-instruct",
    "messages": [{"role": "user", "content": "Write a Python function to add two numbers"}]
  }'

Sample cURL response:

{
  "id": "chatcmpl-456",
  "object": "chat.completion",
  "created": 1677652288,
  "model": "qwen2.5-7b-instruct",
  "choices": [{
    "index": 0,
    "message": {
      "role": "assistant",
      "content": "Here's a simple Python function to add two numbers:\n\n```python\ndef add_numbers(a, b):\n    return a + b\n\n# Example usage\nresult = add_numbers(5, 3)\nprint(result)  # Output: 8\n```"
    },
    "finish_reason": "stop"
  }]
}

📊 See What's Running

$ polarisllm list
📋 Deployed Models
========================================================================
NAME                    STATUS      PORT    MEMORY   UPTIME    TYPE
qwen2.5-7b-instruct     🟢 Running  8000    15.2%    2.5h      qwen2_5
deepseek-coder-6.7b     🟢 Running  8001    12.8%    1.2h      deepseek
mistral-7b-instruct     🔴 Stopped  8002    N/A      N/A       mistral

$ polarisllm status
📊 System Status
===========================================
🖥️  Server: 🟢 Running at http://localhost:7860
🤖 Models: 2 running, 1 stopped
💾 Resources: 2/100 ports used, 28% memory

🚫 Fix Common Issues

Model won't start?

polarisllm logs <model-name>    # Check what went wrong
polarisllm cleanup              # Clean up any stuck processes

Sample error log:

📝 Logs for model: qwen2.5-7b-instruct
============================================================
[ERROR:swift] CUDA out of memory. Tried to allocate 2.0 GiB
[INFO:swift] Try reducing batch size or using a smaller model
[ERROR:swift] Model loading failed

Server not working?

polarisllm status               # See what's happening
polarisllm restart              # Restart everything

Need to free up space?

polarisllm stop --all           # Stop all models
polarisllm cleanup              # Clean up old processes

Sample cleanup output:

🧹 Cleaning up PolarisLLM...
========================================
🔍 Cleaning up dead processes...
🔍 Cleaning up dead models...
🔍 Cleaning up port allocations...
Cleaned up 2 dead port allocations
🔍 Cleaning up old logs...
Cleaned up 3 old log files

✅ Cleanup completed!
💡 Use 'polarisllm status' to verify system state

💡 Pro Tips

  • Models run in the background automatically - they survive terminal restarts
  • Use --follow with logs to watch models start up in real-time
  • Each model gets its own port (8000, 8001, 8002...) - you can point a client at it directly; see the sketch after this list
  • Server remembers your models even after restarts
  • Use shortcuts for popular models, full names for everything else
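
Because each model is backed by its own swift deploy process on its own port, you can, in principle, point a client at that port directly instead of the gateway on 7860. The snippet below is only an illustrative sketch: it assumes the per-model backend (port 8000 in the examples above) exposes the same OpenAI-compatible /v1 routes shown in its startup log, and the model name it expects may differ from the PolarisLLM shortcut. The supported entry point remains the gateway.

import openai

# Assumption: the swift deploy backend on port 8000 serves OpenAI-compatible /v1 routes,
# as its startup log suggests. The normal entry point is still the gateway on 7860.
backend = openai.OpenAI(
    base_url="http://localhost:8000/v1",
    api_key="not-required"
)

response = backend.chat.completions.create(
    model="qwen2.5-7b-instruct",  # the backend may register the model under a different name
    messages=[{"role": "user", "content": "Hello!"}]
)
print(response.choices[0].message.content)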

๐Ÿค Need Help?


📄 License

MIT License - Free to use for any purpose



Download files

Download the file for your platform.

Source Distribution

polarisllm-2.0.1.tar.gz (45.3 kB)

Uploaded Source

Built Distribution


polarisllm-2.0.1-py3-none-any.whl (48.7 kB)

Uploaded Python 3

File details

Details for the file polarisllm-2.0.1.tar.gz.

File metadata

  • Download URL: polarisllm-2.0.1.tar.gz
  • Upload date:
  • Size: 45.3 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.11.5

File hashes

Hashes for polarisllm-2.0.1.tar.gz
Algorithm Hash digest
SHA256 166f0cc83f69fc7403f38586e2738bcd4243658d9eac471d075b7cd4df348752
MD5 5a8dbc232b2be0f0d824ee8a43a25c99
BLAKE2b-256 2bb1b05d2588841368c20df1c9b49e11aa869008dfbfc91bf38427cb51c6fdb2


File details

Details for the file polarisllm-2.0.1-py3-none-any.whl.

File metadata

  • Download URL: polarisllm-2.0.1-py3-none-any.whl
  • Upload date:
  • Size: 48.7 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.11.5

File hashes

Hashes for polarisllm-2.0.1-py3-none-any.whl
Algorithm Hash digest
SHA256 cb38575ba4c50495c6362d57b8610bf530b37c54e186e30ed3ff93a4fcea44c0
MD5 bb171026920b0bdd7fd3508b24b2f67d
BLAKE2b-256 3a8f7942b0edaf7776581e864e0ad3cc838d0ea3d9d1eff352689ddde06ed86a

