GPU-Accelerated LLM Terminal for Apple Silicon

These details have not been verified by PyPI

Project links

Project description

Cortex

GPU-accelerated local LLMs on Apple Silicon, built for the terminal.

Cortex preview

Cortex is a fast, native CLI for running and fine-tuning LLMs on Apple Silicon using MLX and Metal. It automatically detects chat templates, supports multiple model formats, and keeps your workflow inside the terminal.

Highlights

Apple Silicon GPU acceleration via MLX (primary) and PyTorch MPS
Multi-format model support: MLX, GGUF, SafeTensors, PyTorch, GPTQ, AWQ
Built-in LoRA fine-tuning wizard
Chat template auto-detection (ChatML, Llama, Alpaca, Gemma, Reasoning)
Conversation history with autosave and export

Quick Start

pipx install cortex-llm
cortex

Inside Cortex:

/download to fetch a model from HuggingFace
/model to load or manage models
/status to confirm GPU acceleration and current settings

Installation

Option A: pipx (recommended)

pipx install cortex-llm

Option B: from source

git clone https://github.com/faisalmumtaz/Cortex.git
cd Cortex
./install.sh

The installer checks Apple Silicon compatibility, creates a venv, installs dependencies from pyproject.toml, and sets up the cortex command.

Requirements

Apple Silicon Mac (M1/M2/M3/M4)
macOS 13.3+
Python 3.11+
16GB+ unified memory (24GB+ recommended for larger models)
Xcode Command Line Tools

Model Support

Cortex supports:

MLX (recommended)
GGUF (llama.cpp + Metal)
SafeTensors
PyTorch (Transformers + MPS)
GPTQ / AWQ quantized models

Advanced Features

Dynamic quantization fallback for PyTorch/SafeTensors models that do not fit GPU memory (INT8 preferred, INT4 fallback)
- docs/dynamic-quantization.md
MLX conversion with quantization recipes (4/5/8-bit, mixed precision) for speed vs quality control
- docs/mlx-acceleration.md
LoRA fine-tuning wizard for local adapters (/finetune)
- docs/fine-tuning.md
Template registry and auto-detection for chat formatting (ChatML, Llama, Alpaca, Gemma, Reasoning)
- docs/template-registry.md
Inference engine details and backend behavior
- docs/inference-engine.md
Tooling (experimental, WIP) for repo-scoped read/search and optional file edits with explicit confirmation
- docs/cli.md

Important (Work in Progress): Tooling is actively evolving and should be considered experimental. Behavior, output format, and available actions may change; tool calls can fail; and UI presentation may be adjusted. Use tooling on non-critical work first, and always review any proposed file changes before approving them.

Configuration

Cortex reads config.yaml from the current working directory. For tuning GPU memory limits, quantization defaults, and inference parameters, see:

docs/configuration.md

Documentation

Start here:

docs/installation.md
docs/cli.md
docs/model-management.md
docs/troubleshooting.md

Advanced topics:

docs/mlx-acceleration.md
docs/inference-engine.md
docs/dynamic-quantization.md
docs/template-registry.md
docs/fine-tuning.md
docs/development.md

Contributing

Contributions are welcome. See docs/development.md for setup and workflow.

License

MIT License. See LICENSE.

Note: Cortex requires Apple Silicon. Intel Macs are not supported.

Project details

These details have not been verified by PyPI

Project links

Release history Release notifications | RSS feed

1.0.20

Feb 4, 2026

1.0.19

Feb 4, 2026

1.0.18

Feb 4, 2026

1.0.17

Feb 4, 2026

This version

1.0.16

Feb 4, 2026

1.0.15

Feb 4, 2026

1.0.14

Feb 4, 2026

1.0.13

Feb 4, 2026

1.0.12

Feb 4, 2026

1.0.11

Feb 3, 2026

1.0.10

Feb 2, 2026

1.0.9

Feb 2, 2026

1.0.8

Feb 2, 2026

1.0.7

Feb 2, 2026

1.0.6

Feb 1, 2026

1.0.5

Feb 1, 2026

1.0.4

Feb 1, 2026

1.0.3

Feb 1, 2026

1.0.2

Feb 1, 2026

1.0.1

Feb 1, 2026

1.0.0

Feb 1, 2026

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

cortex_llm-1.0.16.tar.gz (164.5 kB view details)

Uploaded Feb 4, 2026 Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

The dropdown lists show the available interpreters, ABIs, and platforms. Enable javascript to be able to filter the list of wheel files.

cortex_llm-1.0.16-py3-none-any.whl (184.4 kB view details)

Uploaded Feb 4, 2026 Python 3

File details

Details for the file cortex_llm-1.0.16.tar.gz.

File metadata

Download URL: cortex_llm-1.0.16.tar.gz
Upload date: Feb 4, 2026
Size: 164.5 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.13.5

File hashes

Hashes for cortex_llm-1.0.16.tar.gz
Algorithm	Hash digest
SHA256	`5380435d79558048fe749a959595272901944612571bdf8df8fa8c1fbc3f1f8d`
MD5	`f57e215404565f0144dd17423ed4938a`
BLAKE2b-256	`212a59710188101ae8dc3f9aa81d2ad834d6c8cc31625d64992376a43cd54fef`

See more details on using hashes here.

File details

Details for the file cortex_llm-1.0.16-py3-none-any.whl.

File metadata

Download URL: cortex_llm-1.0.16-py3-none-any.whl
Upload date: Feb 4, 2026
Size: 184.4 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.13.5

File hashes

Hashes for cortex_llm-1.0.16-py3-none-any.whl
Algorithm	Hash digest
SHA256	`c61f4a48623da481532230c817272af4cda0fab348ffd2f2aa321f1d68b40413`
MD5	`9f73d9a362f26fa55bff80e86a3f21fd`
BLAKE2b-256	`83ffe57c87b449656f8f8dccd3fde39c313a42a5ab4e441aad9d1ad9f8ea5b85`

See more details on using hashes here.

cortex-llm 1.0.16

Navigation

Verified details

Maintainers

Unverified details

Project links

Meta

Classifiers

Project description

Cortex

Highlights

Quick Start

Installation

Option A: pipx (recommended)

Option B: from source

Requirements

Model Support

Advanced Features

Configuration

Documentation

Contributing

License

Project details

Verified details

Maintainers

Unverified details

Project links

Meta

Classifiers

Release history Release notifications | RSS feed

Download files

Source Distribution

Built Distribution

File details

File metadata

File hashes

File details

File metadata

File hashes