Offline AI coding assistant for Apple Silicon. Run LLMs locally with an OpenAI-compatible API.
Project description
local-ai
Run AI models locally on your Mac with zero cloud dependencies.
local-ai brings the power of large language models to your Apple Silicon Mac, completely offline. No API keys, no usage limits, no data leaving your machine.
Why local-ai?
The Problem
- Privacy concerns: Cloud AI services see all your code, prompts, and data
- API costs: Pay-per-token pricing adds up quickly for heavy usage
- Rate limits: Cloud providers throttle requests during peak times
- Internet dependency: No connection = no AI assistance
- Latency: Round-trip to cloud servers adds delay to every interaction
The Solution
local-ai runs models directly on your Mac's GPU using Apple's MLX framework:
- 100% Private: Your data never leaves your machine
- Zero Cost: No API fees, subscriptions, or usage limits
- Always Available: Works offline, on planes, in secure environments
- Low Latency: Direct GPU inference, no network round-trips
- OpenAI Compatible: Works with existing tools that support OpenAI's API
Features
- One-Command Server: Start a local LLM server with local-ai server start
- OpenAI-Compatible API: Drop-in replacement for OpenAI clients
- Model Browser: Discover and download optimized MLX models from Hugging Face
- Hardware Detection: Automatically detects your Mac's capabilities
- Smart Recommendations: Suggests models that fit your available memory
- Web Interface: Built-in chat UI for testing models at http://localhost:8080
- Tool Calling: Function calling support for agentic workflows (see the sketch after this list)
- Rich CLI: Beautiful terminal output with progress bars and status panels
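Tool calling follows the standard OpenAI function-calling schema. A minimal sketch with the openai Python client, assuming the server from the Quick Start below is already running and the loaded model supports tool calls (the get_weather tool is purely illustrative, not part of local-ai):
# Minimal tool-calling sketch against the local server.
# Assumes http://localhost:8080/v1 accepts the standard OpenAI "tools" parameter.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8080/v1", api_key="not-needed")

tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",  # hypothetical tool, defined only for this example
            "description": "Return the current weather for a city",
            "parameters": {
                "type": "object",
                "properties": {"city": {"type": "string"}},
                "required": ["city"],
            },
        },
    }
]

response = client.chat.completions.create(
    model="mlx-community/Qwen3-0.6B-4bit",
    messages=[{"role": "user", "content": "What's the weather in Milan?"}],
    tools=tools,
)

# If the model decided to call the tool, the call arrives as structured JSON.
for call in response.choices[0].message.tool_calls or []:
    print(call.function.name, call.function.arguments)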
Quick Start
Installation from GitHub
# Clone the repository
git clone https://github.com/tumma72/local-ai.git
cd local-ai
# Install with uv (recommended)
uv sync
# Or with pip
pip install -e .
Basic Usage
# Start the server (models load dynamically)
local-ai server start
# Open http://localhost:8080 in your browser for the web UI
# Or use with any OpenAI-compatible client
curl http://localhost:8080/v1/chat/completions \
-H "Content-Type: application/json" \
-d '{
"model": "mlx-community/Qwen3-0.6B-4bit",
"messages": [{"role": "user", "content": "Hello!"}]
}'
Discover Models
# See recommended models for your hardware
local-ai models recommend
# Search for specific models
local-ai models search "llama 8b"
# Get detailed model info
local-ai models info mlx-community/Llama-3.2-3B-Instruct-4bit
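The catalogue behind these commands is the mlx-community organisation on Hugging Face. If you want to browse it directly, a sketch using the huggingface_hub package (not required by local-ai, shown only for orientation):
# Sketch: query the mlx-community catalogue directly with huggingface_hub.
# Independent of local-ai's own `models` commands.
from huggingface_hub import list_models

for m in list_models(author="mlx-community", search="llama 8b", limit=5):
    print(m.id)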
Server Management
# Check server status
local-ai server status
# View server logs
local-ai server logs --follow
# Restart with new settings
local-ai server restart --port 9000
# Stop the server
local-ai server stop
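From scripts or health checks you can also probe the server through its OpenAI-compatible API. A small sketch, assuming the standard GET /v1/models route is exposed:
# Quick readiness check from a script: list models via the OpenAI-compatible API.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8080/v1", api_key="not-needed")

try:
    models = client.models.list()
    print("Server is up. Available models:")
    for m in models.data:
        print(" -", m.id)
except Exception as exc:
    print(f"Server not reachable: {exc}")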
Configuration
Create a config.toml file for persistent settings:
[server]
host = "127.0.0.1"
port = 8080
log_level = "INFO"
[model]
# Default model (optional - models load dynamically)
path = "mlx-community/Qwen3-0.6B-4bit"
Use with CLI:
local-ai server start --config config.toml
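The file is plain TOML, so the keys above map directly onto nested tables. A short illustration of the expected layout using Python 3.11's built-in tomllib (local-ai parses the file itself; this snippet only shows the structure):
# Illustration only: read the config.toml shown above with the standard library.
import tomllib

with open("config.toml", "rb") as f:
    cfg = tomllib.load(f)

print(cfg["server"]["host"], cfg["server"]["port"])   # 127.0.0.1 8080
print(cfg["model"].get("path"))                       # mlx-community/Qwen3-0.6B-4bit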
Requirements
- macOS with Apple Silicon (M1/M2/M3/M4)
- Python 3.11+
- 8GB+ RAM recommended (16GB+ for larger models)
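As a rough rule of thumb (an estimate only, not the logic behind local-ai's recommendations): a 4-bit quantised model needs about half a byte per parameter for its weights, plus headroom for the KV cache, the runtime, and macOS itself:
# Back-of-envelope memory estimate for a 4-bit quantised model.
def approx_weight_gb(params_billions: float, bits: int = 4) -> float:
    bytes_per_param = bits / 8
    return params_billions * bytes_per_param  # 1e9 params * bytes/param ≈ GB

for size in (0.6, 3, 8, 70):
    print(f"{size:>5}B params @ 4-bit ≈ {approx_weight_gb(size):.1f} GB of weights")
# Add a few GB of headroom for the KV cache, runtime, and the OS.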
Use Cases
IDE Integration
local-ai works with any IDE that supports OpenAI-compatible endpoints:
- VS Code with Continue extension
- Cursor (set custom API endpoint)
- Zed editor (configure assistant)
- JetBrains IDEs with AI plugins
Claude Code / Aider / Other Tools
# Set environment variables
export OPENAI_API_BASE=http://localhost:8080/v1
export OPENAI_API_KEY=not-needed
# Use your favorite AI coding tool
aider --model mlx-community/Qwen3-0.6B-4bit
Python Integration
from openai import OpenAI
client = OpenAI(
base_url="http://localhost:8080/v1",
api_key="not-needed"
)
response = client.chat.completions.create(
model="mlx-community/Qwen3-0.6B-4bit",
messages=[{"role": "user", "content": "Explain Python decorators"}]
)
print(response.choices[0].message.content)
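Streaming works the same way, assuming the server supports the standard OpenAI streaming protocol; reusing the client from the snippet above:
# Streaming variant of the same call.
stream = client.chat.completions.create(
    model="mlx-community/Qwen3-0.6B-4bit",
    messages=[{"role": "user", "content": "Explain Python decorators"}],
    stream=True,
)

for chunk in stream:
    if chunk.choices and chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)
print()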
Development
# Install development dependencies
uv sync
# Run tests
uv run pytest
# Run with coverage
uv run pytest --cov=local_ai
# Type checking
uv run mypy src/
# Linting
uv run ruff check src/
Contributing
We welcome contributions! Please see CONTRIBUTING.md for guidelines.
License
Apache License 2.0 - see LICENSE.md for details.
Acknowledgments
- MLX - Apple's machine learning framework
- mlx-omni-server - OpenAI-compatible server (with local patches)
- Hugging Face - Model hosting and community
Project details
Download files
Source Distribution
Built Distribution
File details
Details for the file local_ai_server-0.2.0a0.tar.gz.
File metadata
- Download URL: local_ai_server-0.2.0a0.tar.gz
- Upload date:
- Size: 129.8 kB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | b2eda382c6b53006bb9bce7c2bed6a4315fc17252da56daa966c19235002ecdf |
| MD5 | 9739b38f792b13318efd0d96461126bc |
| BLAKE2b-256 | fa9b6a1cee47f378cd394f62ac3b56a34b759e297980ab8ae21cfa7501f4e7ae |
Provenance
The following attestation bundles were made for local_ai_server-0.2.0a0.tar.gz:
Publisher: release-and-publish.yml on tumma72/local-ai
Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: local_ai_server-0.2.0a0.tar.gz
- Subject digest: b2eda382c6b53006bb9bce7c2bed6a4315fc17252da56daa966c19235002ecdf
- Sigstore transparency entry: 771489449
- Sigstore integration time:
- Permalink: tumma72/local-ai@444b87fd3fdea1cbfe34d44b1800c7f5ff88643d
- Branch / Tag: refs/tags/v0.2.0a0
- Owner: https://github.com/tumma72
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: release-and-publish.yml@444b87fd3fdea1cbfe34d44b1800c7f5ff88643d
- Trigger Event: push
File details
Details for the file local_ai_server-0.2.0a0-py3-none-any.whl.
File metadata
- Download URL: local_ai_server-0.2.0a0-py3-none-any.whl
- Upload date:
- Size: 90.7 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | a4ea1678f423ed9dc642f4e727cc0c08b59d6ef179bef9ebfa561fc5d168318d |
| MD5 | 6ec4a8b61afb89ee4f14d94663b7d250 |
| BLAKE2b-256 | a48f27eb213000484286275b6b9905f6739885a7206e8f37c0e1a188e272cf52 |
Provenance
The following attestation bundles were made for local_ai_server-0.2.0a0-py3-none-any.whl:
Publisher: release-and-publish.yml on tumma72/local-ai
Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: local_ai_server-0.2.0a0-py3-none-any.whl
- Subject digest: a4ea1678f423ed9dc642f4e727cc0c08b59d6ef179bef9ebfa561fc5d168318d
- Sigstore transparency entry: 771489450
- Sigstore integration time:
- Permalink: tumma72/local-ai@444b87fd3fdea1cbfe34d44b1800c7f5ff88643d
- Branch / Tag: refs/tags/v0.2.0a0
- Owner: https://github.com/tumma72
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: release-and-publish.yml@444b87fd3fdea1cbfe34d44b1800c7f5ff88643d
- Trigger Event: push