Apex
Self-hosted ML platform for small AI teams.
Your GPU. Your team. No cloud tax.
Apex is a self-hosted ML platform for teams that own their GPU. One pip install,
one command, your browser opens to a full job queue, live GPU monitoring, and
browser-native VS Code — all running on a single workstation.
What you get
- Job queue with Docker execution — submit training jobs from the UI or API, scheduler picks them up, runs them inside Docker containers with full GPU access, streams logs back over WebSocket
- Real-time GPU telemetry — GPU util, VRAM, temperature, power draw, CPU, RAM, pushed to the dashboard every 2 seconds via server-sent events
- Browser-native VS Code — launch a code-server dev session in a container with your workspace pre-mounted; one click and you're coding on the GPU machine
- Job history + logs — sortable table of every job, filter by status, tail logs from anywhere, cancel running jobs
- Pre-built images — apex/code-server:python and apex/code-server:pytorch (CUDA 12.4 + PyTorch 2.4) bundled as working examples
- No Kubernetes. No cloud bill. No DevOps engineer.
Quickstart
pip install apex-ml
apex start
That's it. Your browser opens to http://localhost:7000.
On first run, Apex prints a one-time login to the terminal:
=== First-run owner account created ===
email: owner@apex.local
password: <random>
Save this — it will not be shown again.
Log in with those credentials. The password is only shown once.
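If you'd rather drive the API from scripts than the browser, you can trade those credentials for a JWT up front. A minimal sketch, assuming the login endpoint accepts an email/password JSON body and returns the token in its JSON response (the field names here are assumptions, not documented behavior):

```python
# Sketch: exchange the first-run credentials for a JWT.
# Assumed: /api/users/login accepts {"email", "password"} and returns the
# token in the JSON body; adjust key names to what your instance returns.
import requests

BASE = "http://localhost:7000"

resp = requests.post(
    f"{BASE}/api/users/login",
    json={"email": "owner@apex.local", "password": "<the printed password>"},
    timeout=10,
)
resp.raise_for_status()
body = resp.json()
token = body.get("token") or body.get("access_token")  # key name assumed
print("JWT:", token)
```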
Requirements
- Python 3.10+
- Docker daemon running (used to execute jobs and dev sessions)
- NVIDIA GPU (optional — Apex degrades gracefully to CPU-only if no GPU is present)
- Linux or macOS (tested on Ubuntu 22.04 + macOS 14)
Build the base images (one-time)
apex build-images # Python image (~2 min)
apex build-images --pytorch # + PyTorch/CUDA image (~15 min, ~8 GB)
Submit your first job
curl -X POST http://localhost:7000/api/jobs \
-H 'Content-Type: application/json' \
-d '{
"name": "hello-world",
"image": "apex/code-server:python",
"script": "python -c \"print(\\\"hello from apex\\\")\"",
"gpu_count": 0,
"priority": "normal"
}'
Or use the UI — click + Submit job on the dashboard.
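The same request from Python, if you script job submission: a minimal sketch using the requests package against the payload shown above.

```python
# Sketch: submit the hello-world job from Python instead of curl.
import requests

job = {
    "name": "hello-world",
    "image": "apex/code-server:python",
    "script": "python -c \"print('hello from apex')\"",
    "gpu_count": 0,
    "priority": "normal",
}

resp = requests.post("http://localhost:7000/api/jobs", json=job, timeout=10)
resp.raise_for_status()
print(resp.json())  # whatever the API returns for the new job (id, status, ...)
```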
Custom images
You can use any Docker image for training jobs:
docker build -t my-team/training-image .
# then specify my-team/training-image when submitting a job
For dev sessions (browser VS Code), the image must have code-server installed.
Build on top of the official base image:
FROM apex/code-server:python
RUN pip install torch transformers
Or use codercom/code-server:latest directly.
Train CIFAR-10 in 60 seconds
# Build the PyTorch image (one-time, ~8 GB)
git clone https://github.com/apexhq-dev/apex && cd apex
docker build -t apex/code-server:pytorch -f docker/pytorch.Dockerfile docker/
# Drop the training script into your workspace
cp runway-workspace/cifar10_train.py ~/apex-workspace/
# Submit the job via the UI or curl
# Image: apex/code-server:pytorch
# Script: python /workspace/cifar10_train.py --epochs 2
# GPU: 1 GPU
The training job hits ~66% test accuracy on an NVIDIA L4 in about 30 seconds.
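For reference, a job script doesn't need anything Apex-specific: read hyperparameters from argv, load data from /workspace, print progress to stdout (that's what the log stream shows), and save artifacts back to /workspace. The sketch below is not the bundled cifar10_train.py, just a minimal stand-in assuming torch and torchvision from the apex/code-server:pytorch image:

```python
# Sketch of a CIFAR-10 training job (stand-in, not the bundled script).
# Assumes the apex/code-server:pytorch image and the /workspace mount.
import argparse
import torch
import torch.nn as nn
import torchvision
import torchvision.transforms as T


def main():
    parser = argparse.ArgumentParser()
    parser.add_argument("--epochs", type=int, default=2)
    args = parser.parse_args()

    device = "cuda" if torch.cuda.is_available() else "cpu"
    tf = T.Compose([T.ToTensor(), T.Normalize((0.5,) * 3, (0.5,) * 3)])
    train = torchvision.datasets.CIFAR10("/workspace/data", train=True, download=True, transform=tf)
    test = torchvision.datasets.CIFAR10("/workspace/data", train=False, download=True, transform=tf)
    train_loader = torch.utils.data.DataLoader(train, batch_size=256, shuffle=True, num_workers=2)
    test_loader = torch.utils.data.DataLoader(test, batch_size=512, num_workers=2)

    # Small CNN: 3x32x32 input -> 64x8x8 features -> 10 classes
    model = nn.Sequential(
        nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        nn.Flatten(), nn.Linear(64 * 8 * 8, 256), nn.ReLU(), nn.Linear(256, 10),
    ).to(device)
    opt = torch.optim.Adam(model.parameters(), lr=1e-3)
    loss_fn = nn.CrossEntropyLoss()

    for epoch in range(args.epochs):
        model.train()
        for x, y in train_loader:
            x, y = x.to(device), y.to(device)
            opt.zero_grad()
            loss_fn(model(x), y).backward()
            opt.step()

        # Evaluate; these stdout lines show up in the Apex log stream
        model.eval()
        correct = total = 0
        with torch.no_grad():
            for x, y in test_loader:
                pred = model(x.to(device)).argmax(dim=1).cpu()
                correct += (pred == y).sum().item()
                total += y.numel()
        print(f"epoch {epoch + 1}: test accuracy {correct / total:.3f}", flush=True)

    # Checkpoints land in the shared workspace, visible to the whole team
    torch.save(model.state_dict(), "/workspace/cifar10_cnn.pt")


if __name__ == "__main__":
    main()
```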
Screenshots
- Overview dashboard — live GPU metrics, stat cards, submit form, job list, active sessions
- Live training logs — WebSocket stream of real CIFAR-10 training output
- Job history — sortable, filterable, per-row logs + cancel/remove actions
- Metrics — large GPU + CPU chart with 8-cell live readings grid
- Dev sessions — browser VS Code in one click, workspace pre-mounted
- Docker images — reads directly from the host daemon, no registry push
The math
Compare a full month of training compute, 24/7:
| Option | Price | Math |
|---|---|---|
| RunPod RTX 4090 | $316 / mo | $0.44/hr × 720 hrs |
| Lambda H100 | $2,150 / mo | $2.99/hr × 720 hrs |
| AWS p4d.24xlarge (8× A100) | $23,594 / mo | $32.77/hr × 720 hrs |
| Your workstation + Apex | $29 / mo | Team tier, 8 seats, unlimited jobs |
Pricing
| Tier | Price | Gets you |
|---|---|---|
| Free | $0 forever | Full feature set, 1 seat, unlimited jobs, community support |
| Team | $29/mo flat | Everything in Free + 8 seats, multi-user auth, audit log, SSO, priority support |
| Hosted | $99/mo + GPU | We run Apex for you on a rented GPU, SLA, backups |
All tiers run the same codebase. Free and Team are self-hosted on your hardware.
Architecture
Intentionally boring. One Python process, one pip install, one GPU machine.
apex/
├── cli.py # click CLI: start, stop, status
├── config.py # loads ~/.apex/config.json
├── docker_mgr.py # docker SDK wrapper (create, start, clean up)
├── monitor/ # GPU (pynvml) + CPU (psutil) collector thread
├── scheduler/ # SQLite-backed queue + worker thread
├── server/ # FastAPI app + routes
│ ├── app.py # app factory
│ ├── auth.py # JWT + bcrypt
│ ├── db.py # SQLite schema + connection helper
│ └── routes/ # metrics, jobs, sessions, images, users
└── static/ # Vanilla HTML/CSS/JS — no build step
Stack: FastAPI · Uvicorn · SQLite · pynvml · docker-py · SSE-Starlette · code-server · vanilla JS
Not in the stack: React · Redis · Postgres · RabbitMQ · Kubernetes · Helm · Node · webpack
See SPEC.md for the full technical specification.
CLI
apex start [--host 0.0.0.0] [--port 7000] [--no-browser]
apex status
apex stop # Ctrl+C works fine, this is a stub
apex logs # pipe `apex start` output instead
apex build-images # build base container images (one-time)
apex config show # print current config
apex config set KEY VALUE # update a config value (workspace, port, host)
apex --version
Configuration
Apex reads ~/.apex/config.json on startup. Defaults:
{
"workspace_path": "~/apex-workspace",
"port": 7000,
"host": "0.0.0.0",
"session_port_range": [8080, 8200],
"jwt_secret": "<auto-generated on first run>"
}
View or change settings with the apex config command:
apex config show # print current config
apex config set workspace /mnt/nas/apex # change workspace path
apex config set port 8000 # change UI port
apex config set host 127.0.0.1 # bind to localhost only
Or use the APEX_WORKSPACE environment variable (useful for system-level config):
export APEX_WORKSPACE=/mnt/nas/apex
apex start
Shared workspace (NAS / network drives)
Every job and dev session mounts the workspace directory into the container at /workspace. Pointing multiple team members' machines at the same network share means scripts, datasets, and checkpoints are visible to everyone without manual copying.
# 1. Mount the shared drive at the OS level (NFS example)
sudo mount -t nfs 192.168.1.10:/shared /mnt/nas
# 2. Point Apex at it
apex config set workspace /mnt/nas/apex-workspace
# 3. Restart
apex stop && apex start
Samba (SMB) mounts work the same way — mount it at the OS level, then run apex config set workspace <path>. The workspace directory is created automatically if it does not exist.
API
| Route | Method | Description |
|---|---|---|
| /api/health | GET | {"ok": true} |
| /api/metrics/stream | GET (SSE) | Live GPU/CPU metrics every 2s |
| /api/metrics/current | GET | Single snapshot of current metrics |
| /api/jobs | GET | List jobs (filters: status, limit, offset) |
| /api/jobs | POST | Submit a new job |
| /api/jobs/{id} | GET | Job detail |
| /api/jobs/{id} | DELETE | Cancel running job or remove completed one |
| /api/jobs/{id}/logs | WS | Stream container logs line-by-line |
| /api/sessions | GET/POST/DELETE | List, launch, stop dev sessions |
| /api/images | GET | List Docker images from the host daemon |
| /api/users/login | POST | JWT login |
| /api/users/me | GET | Current user |
| /api/users/invite | POST | Invite a new team member (admin only) |
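A worked example of the read-only routes: list jobs, grab a metrics snapshot, and tail the SSE stream for a few events. This is a sketch assuming the requests package and a standard Authorization: Bearer header on authenticated routes (the auth scheme isn't spelled out in the table above):

```python
# Sketch: poke the Apex API with requests. Bearer-token auth is an assumption;
# drop the header if your instance accepts unauthenticated local requests.
import json
import requests

BASE = "http://localhost:7000"
HEADERS = {"Authorization": "Bearer <your JWT>"}  # assumed scheme

# Health check
print(requests.get(f"{BASE}/api/health", timeout=5).json())

# Most recent completed jobs
jobs = requests.get(
    f"{BASE}/api/jobs",
    params={"status": "completed", "limit": 5},
    headers=HEADERS,
    timeout=10,
).json()
print(json.dumps(jobs, indent=2))

# Single metrics snapshot
print(requests.get(f"{BASE}/api/metrics/current", headers=HEADERS, timeout=10).json())

# Tail the SSE metrics stream for five events
with requests.get(f"{BASE}/api/metrics/stream", headers=HEADERS, stream=True, timeout=30) as r:
    seen = 0
    for line in r.iter_lines(decode_unicode=True):
        if line and line.startswith("data:"):
            print(line[len("data:"):].strip())
            seen += 1
            if seen >= 5:
                break
```

The per-job log stream at /api/jobs/{id}/logs is a WebSocket endpoint, so it needs a WebSocket client (for example the websockets package) rather than requests.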
Development
git clone https://github.com/apexhq-dev/apex
cd apex
pip install -e .
apex start --skip-docker-check # dev mode without Docker
The frontend is plain HTML/CSS/JS in apex/static/ — edit and refresh, no build step.
License
Apache License 2.0 © 2026 Apex contributors
Community
- Website — tryapex.dev
- Discord — discord.gg/RFpDyhdpWJ
- Twitter — @apexhq_dev
- GitHub Issues — for bugs and feature requests
- GitHub Discussions — for questions and ideas
Download files
File details
Details for the file apex_ml-0.2.2.tar.gz.
File metadata
- Download URL: apex_ml-0.2.2.tar.gz
- Size: 55.4 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.12.11
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | f230a3d974c5d75adadfd9be92ea04d04d722d613addf9c8050f20692dcce0c4 |
| MD5 | 004601795bfc31bd1bf79f219396380d |
| BLAKE2b-256 | 8c52ca19177b56faff103c1dd7d76f2f91a5a75242e30e49375c791307c0b04d |
File details
Details for the file apex_ml-0.2.2-py3-none-any.whl.
File metadata
- Download URL: apex_ml-0.2.2-py3-none-any.whl
- Size: 62.4 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.12.11
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | afa4e586b3cd4cdce0339c15f5aff35cb934a4dd16b0720f8c44a80f12e117ee |
| MD5 | 6f2ab2842a58a08f5e964537525e6a43 |
| BLAKE2b-256 | 328df05e56c829d59372bbca8376c1b01dbf21b61c0ab1bd67bc72ee5ac02a86 |