GPU-Accelerated LLM Terminal for Apple Silicon

Cortex

GPU-accelerated local LLMs on Apple Silicon, built for the terminal.

Cortex is a fast, native CLI for running and fine-tuning LLMs on Apple Silicon using MLX and Metal. It automatically detects chat templates, supports multiple model formats, and keeps your workflow inside the terminal.

Highlights

  • Apple Silicon GPU acceleration via MLX (primary) and PyTorch MPS
  • Multi-format model support: MLX, GGUF, SafeTensors, PyTorch, GPTQ, AWQ
  • Built-in LoRA fine-tuning wizard
  • Chat template auto-detection (ChatML, Llama, Alpaca, Gemma, Reasoning)
  • Conversation history with branching

Quick Start

pipx install cortex-llm
cortex

Inside Cortex:

  • /download to fetch a model from Hugging Face
  • /model to load or manage models
  • /status to confirm GPU acceleration and current settings

Installation

Option A: pipx (recommended)

pipx install cortex-llm

Option B: from source

git clone https://github.com/faisalmumtaz/Cortex.git
cd Cortex
./install.sh

The installer checks Apple Silicon compatibility, creates a virtual environment, installs the dependencies listed in pyproject.toml, and sets up the cortex command.

Requirements

  • Apple Silicon Mac (M1/M2/M3/M4)
  • macOS 13.3+
  • Python 3.11+
  • 16GB+ unified memory (24GB+ recommended for larger models)
  • Xcode Command Line Tools
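
To check a machine against the list above before installing, a small preflight script along these lines may help. This is a minimal sketch using only the Python standard library, not Cortex's own installer logic; the function name check_requirements and its parameters are hypothetical, and it does not check unified memory or the Xcode Command Line Tools.

```python
import platform
import sys


def check_requirements(machine=None, system=None, mac_version=None, python_version=None):
    """Return a list of problems; an empty list means the checked requirements are met.

    All parameters are optional overrides (hypothetical, for testing); by default
    the values are read from the running interpreter and operating system.
    """
    machine = platform.machine() if machine is None else machine
    system = platform.system() if system is None else system
    mac_version = platform.mac_ver()[0] if mac_version is None else mac_version
    python_version = tuple(sys.version_info[:2]) if python_version is None else python_version

    problems = []

    # Apple Silicon Macs report Darwin/arm64. A Python running under Rosetta
    # may report x86_64 even on Apple Silicon hardware.
    if system != "Darwin" or machine != "arm64":
        problems.append("Apple Silicon Mac required (Intel Macs are not supported)")
    else:
        # Compare only major.minor, e.g. "13.3.1" -> (13, 3).
        parts = tuple(int(p) for p in mac_version.split(".")[:2]) if mac_version else (0,)
        if parts < (13, 3):
            problems.append(f"macOS 13.3+ required, found {mac_version or 'unknown'}")

    if tuple(python_version) < (3, 11):
        problems.append("Python 3.11+ required")

    return problems


if __name__ == "__main__":
    issues = check_requirements()
    for issue in issues:
        print("FAIL:", issue)
    if not issues:
        print("Basic platform checks passed")
```

Run it with the same Python interpreter that will host Cortex, since the Python version check applies to that interpreter.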

Model Support

Cortex supports:

  • MLX (recommended)
  • GGUF (llama.cpp + Metal)
  • SafeTensors
  • PyTorch (Transformers + MPS)
  • GPTQ / AWQ quantized models

Configuration

Cortex reads config.yaml from the current working directory. For tuning GPU memory limits, quantization defaults, and inference parameters, see:

  • docs/configuration.md
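
Because config.yaml is looked up in the current working directory, the directory you launch cortex from determines which configuration applies. The sketch below illustrates that lookup rule only; it is an assumption about the behavior described above, not Cortex's actual loader, and the helper name find_config is hypothetical. It does not parse the file, since the available keys are documented in docs/configuration.md.

```python
from pathlib import Path


def find_config(cwd=None):
    """Return the path to config.yaml in the given directory, or None if absent.

    `cwd` defaults to the process's current working directory, mirroring the
    lookup rule described above (hypothetical helper, not part of Cortex).
    """
    base = Path(cwd) if cwd is not None else Path.cwd()
    candidate = base / "config.yaml"
    return candidate if candidate.is_file() else None


if __name__ == "__main__":
    found = find_config()
    if found is None:
        print("No config.yaml in", Path.cwd())
    else:
        print("Would use", found)
```

Running this from the directory where you plan to start cortex shows whether a config.yaml would be visible from there.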

Documentation

Start here:

  • docs/installation.md
  • docs/cli.md
  • docs/model-management.md
  • docs/troubleshooting.md

Advanced topics:

  • docs/mlx-acceleration.md
  • docs/inference-engine.md
  • docs/template-registry.md
  • docs/fine-tuning.md
  • docs/development.md

Contributing

Contributions are welcome. See docs/development.md for setup and workflow.

License

MIT License. See LICENSE.


Note: Cortex requires Apple Silicon. Intel Macs are not supported.
