The Ultimate Multi-Model LLM Runtime Platform - Deploy, manage, and serve 300+ language models with OpenAI-compatible APIs. Built on ms-swift for production-ready performance.
This project has been archived by its maintainers. No new releases are expected.
PolarisLLM - AI Model Orchestration Platform
Deploy and manage 300+ AI models with simple commands
Transform your server into a powerful AI platform. Deploy models in the background, manage them with ease, and access everything through OpenAI-compatible APIs.
What You Get
- Background Model Deployment - Models run automatically in the background
- Simple Management - Start, stop, and monitor with easy commands
- Real-time Monitoring - See status, memory usage, and live logs
- OpenAI Compatible - Works with existing OpenAI code
- 300+ Models - Qwen, Llama, DeepSeek, Mistral, and more
Get Started
Install
pip install polarisllm --upgrade
Start Server
polarisllm start --daemon
PolarisLLM Runtime Engine
==================================================
Starting PolarisLLM server in daemon mode...
Host: 0.0.0.0
Port: 7860
Log File: /home/user/.polarisllm/logs/server.log
Server started successfully!
PID: 12345
URL: http://0.0.0.0:7860
Commands:
polarisllm status # Check server status
polarisllm logs --server # View server logs
polarisllm stop --server # Stop server
Deploy Your First Model
polarisllm deploy --model qwen2.5-7b-instruct
Deploying model: qwen2.5-7b-instruct
Using convenience shortcut for qwen2.5-7b-instruct
Model Type: qwen2_5
Model ID: Qwen/Qwen2.5-7B-Instruct
Allocated port: 8000
Command: swift deploy --model_type qwen2_5 --model Qwen/Qwen2.5-7B-Instruct --port 8000 --host 0.0.0.0
Logs: /home/user/.polarisllm/logs/qwen2.5-7b-instruct.log
Starting deployment in background...
Started process 12346 for qwen2.5-7b-instruct
Model deployment started successfully!
Name: qwen2.5-7b-instruct
PID: 12346
Port: 8000
Status: Initializing...
Monitor with: polarisllm logs qwen2.5-7b-instruct --follow
Check status: polarisllm status
Access via: http://localhost:7860/v1/chat/completions
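Once the deployment is started, the model still needs time to download and load its weights, so the first requests may fail while the status is "Initializing...". A minimal readiness check in Python, assuming the gateway simply returns an error until the model can answer (the retry count and delay are illustrative, not PolarisLLM defaults):

import time
import openai

client = openai.OpenAI(base_url="http://localhost:7860/v1", api_key="not-required")

# Poll the chat endpoint until the freshly deployed model responds
# (assumed behavior: the gateway errors while the model is still loading).
for attempt in range(30):
    try:
        client.chat.completions.create(
            model="qwen2.5-7b-instruct",
            messages=[{"role": "user", "content": "ping"}],
            max_tokens=1,
        )
        print("Model is ready")
        break
    except Exception as exc:
        print(f"Not ready yet ({exc}); retrying in 10s...")
        time.sleep(10)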
Check What's Running
polarisllm list
Deployed Models
========================================================================
NAME STATUS PORT MEMORY UPTIME TYPE
qwen2.5-7b-instruct Running 8000 15.2% 2.5h qwen2_5
Summary:
Total Models: 1
Running: 1
Stopped: 0
Commands:
polarisllm logs qwen2.5-7b-instruct --follow # View live logs
polarisllm stop qwen2.5-7b-instruct # Stop a model
polarisllm status # Detailed status
Common Commands
Deploy Models
# Popular models (shortcuts available)
polarisllm deploy --model qwen2.5-7b-instruct
polarisllm deploy --model deepseek-coder-6.7b
polarisllm deploy --model mistral-7b-instruct
# Any model with full name
polarisllm deploy --model my-llama \
--model-type llama3_1 \
--model-id meta-llama/Meta-Llama-3.1-8B-Instruct
Sample deployment output:
Deploying model: deepseek-coder-6.7b
Using convenience shortcut for deepseek-coder-6.7b
Model Type: deepseek
Model ID: deepseek-ai/deepseek-coder-6.7b-instruct
Allocated port: 8001
Starting deployment in background...
Model deployed successfully on port 8001!
Manage Your Models
polarisllm list # See all models
polarisllm status # System overview
polarisllm stop qwen2.5-7b-instruct # Stop a model
polarisllm undeploy qwen2.5-7b-instruct # Remove completely
Status output:
$ polarisllm status
PolarisLLM System Status
============================================================
Server Status:
Status: Running (PID: 12345)
Memory: 2.1%
CPU: 0.5%
API: Healthy
URL: http://localhost:7860
Models Status:
Total Models: 2
Running: 2
Stopped: 0
Detailed Status:
qwen2.5-7b-instruct: running
Port: 8000, Memory: 15.2%, Uptime: 2.5h
deepseek-coder-6.7b: running
Port: 8001, Memory: 12.8%, Uptime: 1.2h
Resource Status:
Ports: 2/100 used (98 available)
Range: 8000-8100
Total Memory: 28.0% (all models combined)
Quick Commands:
polarisllm deploy --model <name> # Deploy a model
polarisllm list # List all models
polarisllm logs <model> --follow # View live logs
polarisllm stop <model> # Stop a model
polarisllm start --daemon # Start server in background
Watch Logs
polarisllm logs qwen2.5-7b-instruct --follow # Live logs
polarisllm logs --server --follow # Server logs
Sample log output:
Logs for model: qwen2.5-7b-instruct
Lines: 100
Follow: Yes
============================================================
Streaming logs (Press Ctrl+C to stop)...
[INFO:swift] Successfully registered model
[INFO:swift] rank: -1, local_rank: -1, world_size: 1
[INFO:swift] Loading the model using model_dir: /cache/Qwen2___5-7B-Instruct
[INFO:swift] Loading model weights...
[INFO:swift] Model loaded successfully
[INFO:swift] Server started on http://0.0.0.0:8000
[INFO:swift] Waiting for requests...
Server Control
polarisllm start --daemon # Start in background
polarisllm stop --server # Stop server
polarisllm restart # Restart everything
Available Models
Popular Shortcuts:
- qwen2.5-7b-instruct - Great all-around chat model
- qwen2.5-14b-instruct - Larger version for better responses
- deepseek-coder-6.7b - Excellent for programming
- deepseek-vl-7b-chat - Understands images and text
- mistral-7b-instruct - Fast and efficient
- llama3.1-8b-instruct - Meta's latest model
Categories:
- Chat: General conversation and Q&A
- Code: Programming and development help
- Vision: Image understanding and analysis
- Audio: Speech and sound processing
See all 300+ models: python -m swift list-models
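The vision shortcut listed above (deepseek-vl-7b-chat) takes images as input. Here is a sketch of an image request through the gateway, assuming it accepts the OpenAI-style image_url message content; that behavior and the file name photo.jpg are assumptions, not confirmed by this README:

import base64
import openai

client = openai.OpenAI(base_url="http://localhost:7860/v1", api_key="not-required")

# Encode a local image and send it using the OpenAI multimodal message format
# (assumed to be supported by the gateway).
with open("photo.jpg", "rb") as f:
    image_b64 = base64.b64encode(f.read()).decode()

response = client.chat.completions.create(
    model="deepseek-vl-7b-chat",
    messages=[{
        "role": "user",
        "content": [
            {"type": "text", "text": "What is in this image?"},
            {"type": "image_url", "image_url": {"url": f"data:image/jpeg;base64,{image_b64}"}},
        ],
    }],
)
print(response.choices[0].message.content)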
Use with Your Code
Python
import openai
client = openai.OpenAI(
base_url="http://localhost:7860/v1",
api_key="not-required"
)
response = client.chat.completions.create(
model="qwen2.5-7b-instruct",
messages=[{"role": "user", "content": "Hello!"}]
)
print(response.choices[0].message.content)
Sample response:
{
"id": "chatcmpl-123",
"object": "chat.completion",
"created": 1677652288,
"model": "qwen2.5-7b-instruct",
"choices": [{
"index": 0,
"message": {
"role": "assistant",
"content": "Hello! I'm an AI assistant powered by PolarisLLM. How can I help you today?"
},
"finish_reason": "stop"
}],
"usage": {
"prompt_tokens": 9,
"completion_tokens": 20,
"total_tokens": 29
}
}
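Because the endpoint follows the OpenAI chat-completions format, streaming should work with the same client by passing stream=True. Whether the underlying ms-swift deployment streams tokens is an assumption here, so treat this as a sketch rather than a guaranteed feature:

import openai

client = openai.OpenAI(base_url="http://localhost:7860/v1", api_key="not-required")

# Request a streamed completion and print tokens as they arrive.
stream = client.chat.completions.create(
    model="qwen2.5-7b-instruct",
    messages=[{"role": "user", "content": "Tell me a short joke."}],
    stream=True,
)
for chunk in stream:
    if chunk.choices and chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)
print()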
JavaScript
import OpenAI from 'openai';
const client = new OpenAI({
baseURL: 'http://localhost:7860/v1',
apiKey: 'not-required'
});
const completion = await client.chat.completions.create({
model: 'qwen2.5-7b-instruct',
messages: [{ role: 'user', content: 'Hello!' }]
});
cURL
curl -X POST "http://localhost:7860/v1/chat/completions" \
-H "Content-Type: application/json" \
-d '{
"model": "qwen2.5-7b-instruct",
"messages": [{"role": "user", "content": "Write a Python function to add two numbers"}]
}'
Sample cURL response:
{
"id": "chatcmpl-456",
"object": "chat.completion",
"created": 1677652288,
"model": "qwen2.5-7b-instruct",
"choices": [{
"index": 0,
"message": {
"role": "assistant",
"content": "Here's a simple Python function to add two numbers:\n\n```python\ndef add_numbers(a, b):\n return a + b\n\n# Example usage\nresult = add_numbers(5, 3)\nprint(result) # Output: 8\n```"
},
"finish_reason": "stop"
}]
}
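OpenAI-compatible servers usually also expose a model listing at /v1/models. That endpoint is not shown in the output above, so whether the PolarisLLM gateway implements it is an assumption; if it does, the standard client call looks like this:

import openai

client = openai.OpenAI(base_url="http://localhost:7860/v1", api_key="not-required")

# List whatever models the gateway reports as available
# (assumes a standard /v1/models endpoint).
for model in client.models.list():
    print(model.id)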
See What's Running
$ polarisllm list
Deployed Models
========================================================================
NAME STATUS PORT MEMORY UPTIME TYPE
qwen2.5-7b-instruct Running 8000 15.2% 2.5h qwen2_5
deepseek-coder-6.7b Running 8001 12.8% 1.2h deepseek
mistral-7b-instruct Stopped 8002 N/A N/A mistral
$ polarisllm status
System Status
===========================================
Server: Running at http://localhost:7860
Models: 2 running, 1 stopped
Resources: 2/100 ports used, 28% memory
Fix Common Issues
Model won't start?
polarisllm logs <model-name> # Check what went wrong
polarisllm cleanup # Clean up any stuck processes
Sample error log:
Logs for model: qwen2.5-7b-instruct
============================================================
[ERROR:swift] CUDA out of memory. Tried to allocate 2.0 GiB
[INFO:swift] Try reducing batch size or using a smaller model
[ERROR:swift] Model loading failed
Server not working?
polarisllm status # See what's happening
polarisllm restart # Restart everything
Need to free up space?
polarisllm stop --all # Stop all models
polarisllm cleanup # Clean up old processes
Sample cleanup output:
Cleaning up PolarisLLM...
========================================
Cleaning up dead processes...
Cleaning up dead models...
Cleaning up port allocations...
Cleaned up 2 dead port allocations
Cleaning up old logs...
Cleaned up 3 old log files
Cleanup completed!
Use 'polarisllm status' to verify system state
Pro Tips
- Models run in background automatically - they survive terminal restarts
- Use --follow with logs to watch models start up in real time
- Each model gets its own port (8000, 8001, 8002...)
- Server remembers your models even after restarts
- Use shortcuts for popular models, full names for everything else
Need Help?
- GitHub Issues: Report bugs or request features
- PyPI Package: Install from PyPI with pip install polarisllm
License
MIT License - Free to use for any purpose