MLX Omni Server is a server that provides OpenAI-compatible APIs using Apple's MLX framework.

MLX Omni Server

Local AI inference server optimized for Apple Silicon

MLX Omni Server is compatible with both the OpenAI and Anthropic APIs, enabling local inference on Apple Silicon using the MLX framework.

Installation • Quick Start • Documentation • Contributing

✨ Features

- 🚀 Apple Silicon Optimized - Built on the MLX framework for M1/M2/M3/M4 chips
- 🔌 Dual API Support - Compatible with both the OpenAI and Anthropic APIs
- 🎯 Complete AI Suite - Chat, audio processing, image generation, embeddings
- ⚡ High Performance - Local inference with hardware acceleration
- 🔐 Privacy-First - All processing happens locally on your machine
- 🛠 Drop-in Replacement - Works with existing OpenAI and Anthropic SDKs

🚀 Installation

```bash
pip install mlx-omni-server
```

⚡ Quick Start

Start the server:
```
mlx-omni-server
```

Choose your preferred API:

OpenAI API

```python
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:10240/v1",
    api_key="not-needed"
)

response = client.chat.completions.create(
    model="mlx-community/gemma-3-1b-it-4bit-DWQ",
    messages=[{"role": "user", "content": "Hello!"}]
)
print(response.choices[0].message.content)
```

Anthropic API

```python
import anthropic

client = anthropic.Anthropic(
    base_url="http://localhost:10240/anthropic",
    api_key="not-needed"
)

message = client.messages.create(
    model="mlx-community/gemma-3-1b-it-4bit-DWQ",
    max_tokens=1000,
    messages=[{"role": "user", "content": "Hello!"}]
)
print(message.content[0].text)
```

🎉 That's it! You're now running AI locally on your Mac.

📋 API Support

OpenAI Compatible Endpoints (`/v1/*`)

| Endpoint | Feature | Status |
|---|---|---|
| `/v1/chat/completions` | Chat with tools, streaming, structured output | ✅ |
| `/v1/audio/speech` | Text-to-Speech | ✅ |
| `/v1/audio/transcriptions` | Speech-to-Text | ✅ |
| `/v1/images/generations` | Image Generation | ✅ |
| `/v1/embeddings` | Text Embeddings | ✅ |
| `/v1/models` | Model Management | ✅ |
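
The table lists streaming for `/v1/chat/completions`. As a rough sketch, assuming the server uses the same server-sent-event framing as the OpenAI API (`data: {json}` lines terminated by `data: [DONE]`), the stdlib-only helpers below build a streaming request and pull out the text fragments. The helper names are hypothetical; the base URL and model ID are taken from the Quick Start.

```python
import json
import urllib.request
from typing import Iterable, Iterator

BASE_URL = "http://localhost:10240/v1"  # default port from the Quick Start


def build_stream_request(prompt: str, model: str) -> urllib.request.Request:
    """Build a POST request for /chat/completions with streaming enabled."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": True,
    }
    return urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def iter_text_deltas(lines: Iterable[bytes]) -> Iterator[str]:
    """Yield content fragments from OpenAI-style SSE lines (assumed format)."""
    for raw in lines:
        line = raw.decode("utf-8").strip()
        if not line.startswith("data:"):
            continue  # skip blank separator lines and SSE comments
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            break  # end-of-stream marker
        chunk = json.loads(data)
        for choice in chunk.get("choices", []):
            text = (choice.get("delta") or {}).get("content")
            if text:
                yield text


def stream_chat(prompt: str, model: str) -> str:
    """Send the request and join the streamed fragments (needs a running server)."""
    with urllib.request.urlopen(build_stream_request(prompt, model)) as resp:
        return "".join(iter_text_deltas(resp))
```

With the server running, `stream_chat("Hello!", "mlx-community/gemma-3-1b-it-4bit-DWQ")` should return the full reply. To print tokens as they arrive, iterate over `iter_text_deltas(resp)` instead of joining.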

Anthropic Compatible Endpoints (`/anthropic/v1/*`)

| Endpoint | Feature | Status |
|---|---|---|
| `/anthropic/v1/messages` | Messages with tools, streaming, thinking mode | ✅ |
| `/anthropic/v1/models` | Model listing with pagination | ✅ |
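
If you prefer not to install the Anthropic SDK, the Messages endpoint can in principle be called over plain HTTP. The sketch below is an illustration, not the project's own client: it assumes the server accepts the standard Anthropic request body, and it sends the `x-api-key` and `anthropic-version` headers that the hosted Anthropic API expects (whether this server checks them is an assumption). The function names are hypothetical.

```python
import json
import urllib.request

ANTHROPIC_BASE = "http://localhost:10240/anthropic"  # from the Quick Start


def build_messages_request(
    prompt: str, model: str, max_tokens: int = 1000
) -> urllib.request.Request:
    """Build a POST request for /anthropic/v1/messages."""
    payload = {
        "model": model,
        "max_tokens": max_tokens,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        f"{ANTHROPIC_BASE}/v1/messages",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "x-api-key": "not-needed",          # placeholder, as in the Quick Start
            "anthropic-version": "2023-06-01",  # header used by the hosted API
        },
        method="POST",
    )


def text_from_message(message: dict) -> str:
    """Join the text blocks of a Messages response, skipping other block types."""
    return "".join(
        block.get("text", "")
        for block in message.get("content", [])
        if block.get("type") == "text"
    )


def send_message(prompt: str, model: str) -> str:
    """Send the request and return the reply text (needs a running server)."""
    with urllib.request.urlopen(build_messages_request(prompt, model)) as resp:
        return text_from_message(json.load(resp))
```

`text_from_message` keeps only `text` blocks, so any `thinking` blocks returned in thinking mode are left out of the final string.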

⚙️ Configuration

```bash
# Default (port 10240)
mlx-omni-server

# Custom options
mlx-omni-server --port 8000
MLX_OMNI_LOG_LEVEL=debug mlx-omni-server

# View all options
mlx-omni-server --help
```
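
When the server runs on a custom port, the client base URLs have to change with it. A minimal sketch, assuming the `/v1` and `/anthropic` prefixes from the Quick Start stay the same on any port; both helper names are hypothetical.

```python
import socket


def base_urls(port: int = 10240, host: str = "localhost") -> dict:
    """Return the SDK base URLs for a server listening on host:port."""
    root = f"http://{host}:{port}"
    return {"openai": f"{root}/v1", "anthropic": f"{root}/anthropic"}


def is_listening(port: int = 10240, host: str = "localhost", timeout: float = 1.0) -> bool:
    """Report whether something accepts TCP connections on host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

For a server started with `--port 8000`, pass `base_urls(8000)["openai"]` as `base_url` to the OpenAI client and `base_urls(8000)["anthropic"]` to the Anthropic client. `is_listening(8000)` only confirms that the port is open, not that the process behind it is MLX Omni Server.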

🛠 Development

Development Setup

```bash
git clone https://github.com/madroidmaq/mlx-omni-server.git
cd mlx-omni-server
uv sync

# Start with hot-reload
uv run uvicorn mlx_omni_server.main:app --reload --host 0.0.0.0 --port 10240
```

Testing:

```bash
uv run pytest                       # All tests
uv run pytest tests/chat/openai/    # OpenAI tests
uv run pytest tests/chat/anthropic/ # Anthropic tests
```

Code Quality:

```bash
uv run black . && uv run isort .  # Format code
uv run pre-commit run --all-files # Run hooks
```

🎯 Key Features

Model Management

- Auto-discovery of MLX models in the HuggingFace cache
- On-demand loading and intelligent caching
- Automatic model downloading when needed
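
To see which models are already in the cache before starting the server, you can inspect the Hugging Face hub cache yourself. The sketch below is not the server's discovery logic; it only relies on the hub's `models--<org>--<name>` folder naming and the `HF_HUB_CACHE` / `HF_HOME` environment variables. The function names are hypothetical.

```python
import os
from pathlib import Path
from typing import List, Optional


def default_hub_cache() -> Path:
    """Resolve the hub cache directory from the usual environment variables."""
    if "HF_HUB_CACHE" in os.environ:
        return Path(os.environ["HF_HUB_CACHE"])
    hf_home = os.environ.get("HF_HOME", str(Path.home() / ".cache" / "huggingface"))
    return Path(hf_home) / "hub"


def cached_model_ids(cache_dir: Path, org: Optional[str] = None) -> List[str]:
    """List repo IDs found as `models--<org>--<name>` folders in cache_dir."""
    ids: List[str] = []
    if not cache_dir.is_dir():
        return ids
    for entry in sorted(cache_dir.iterdir()):
        if not entry.is_dir() or not entry.name.startswith("models--"):
            continue  # ignore datasets, lock folders, and stray files
        # The hub stores "org/name" as "org--name"; names that themselves
        # contain "--" would be split incorrectly by this simple reversal.
        repo_id = entry.name[len("models--"):].replace("--", "/")
        if org is None or repo_id.startswith(f"{org}/"):
            ids.append(repo_id)
    return ids
```

For example, `cached_model_ids(default_hub_cache(), org="mlx-community")` lists the cached `mlx-community` repositories. Filtering by organization is only a heuristic for finding MLX-format models.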

Advanced Capabilities

- Function calling with model-specific parsers
- Real-time streaming for both APIs
- JSON schema validation and structured output
- Extended reasoning (thinking mode) for supported models
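
Function calling over the OpenAI-compatible API follows the usual `tools` request shape. The sketch below builds such a payload and reads tool calls back out of a response dictionary. It assumes the server returns `tool_calls` in the standard OpenAI layout; the `get_weather` tool and all helper names are hypothetical.

```python
import json
from typing import List, Tuple


def weather_tool() -> dict:
    """A hypothetical tool definition in the OpenAI `tools` format."""
    return {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Look up the current weather for a city.",
            "parameters": {
                "type": "object",
                "properties": {"city": {"type": "string"}},
                "required": ["city"],
            },
        },
    }


def chat_payload_with_tools(prompt: str, model: str, tools: List[dict]) -> dict:
    """Build a /v1/chat/completions request body that offers tools to the model."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "tools": tools,
    }


def extract_tool_calls(response: dict) -> List[Tuple[str, dict]]:
    """Return (function name, parsed arguments) pairs from a chat completion dict."""
    message = response["choices"][0]["message"]
    calls = []
    for call in message.get("tool_calls") or []:
        fn = call["function"]
        # Arguments arrive as a JSON-encoded string in the OpenAI format.
        calls.append((fn["name"], json.loads(fn["arguments"])))
    return calls
```

When using the OpenAI SDK, pass the same list as `tools=[weather_tool()]` to `client.chat.completions.create(...)`. `extract_tool_calls` works on `response.model_dump()`.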

📚 Documentation

| Resource | Description |
|---|---|
| OpenAI API Guide | Complete OpenAI API reference |
| Anthropic API Guide | Complete Anthropic API reference |
| Examples | Practical usage examples |

🔍 Troubleshooting

Common Issues

Requirements:

- Python 3.11+
- Apple Silicon Mac (M1/M2/M3/M4)
- MLX framework installed

Quick fixes:

```bash
# Check requirements
python --version  # Should be 3.11+
python -c "import mlx; print(mlx.__version__)"

# Pre-download models (if needed)
huggingface-cli download mlx-community/gemma-3-1b-it-4bit-DWQ

# Enable debug logging
MLX_OMNI_LOG_LEVEL=debug mlx-omni-server
```
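
The same requirement checks can be done from Python, which helps when the `python` on your `PATH` is not the interpreter the server runs in. This is a small sketch with a hypothetical function name; it tests the three listed requirements and nothing else.

```python
import importlib.util
import platform
import sys
from typing import Tuple


def check_requirements(min_python: Tuple[int, int] = (3, 11)) -> dict:
    """Report which of the listed requirements hold for this interpreter."""
    return {
        # Interpreter version is at least the documented minimum.
        "python": sys.version_info[:2] >= min_python,
        # macOS on arm64; an x86_64 Python running under Rosetta reports False.
        "apple_silicon": platform.system() == "Darwin" and platform.machine() == "arm64",
        # The mlx package can be found without importing it.
        "mlx": importlib.util.find_spec("mlx") is not None,
    }
```

Run `print(check_requirements())` inside the environment where `mlx-omni-server` is installed. Any `False` entry points to the requirement to fix first.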

🤝 Contributing

Quick contributor setup:

```bash
git clone https://github.com/madroidmaq/mlx-omni-server.git
cd mlx-omni-server
uv sync && uv run pytest
```

🙏 Acknowledgments

Built with MLX by Apple • FastAPI • MLX-LM

📄 License

MIT License • Not affiliated with OpenAI, Anthropic, or Apple

Release history

| Version | Released |
|---|---|
| 0.5.3 | May 9, 2026 |
| 0.5.2 (this version) | Dec 21, 2025 |
| 0.5.1 | Oct 14, 2025 |
| 0.5.0 | Sep 2, 2025 |
| 0.4.9 | Aug 20, 2025 |
| 0.4.8 | Aug 18, 2025 |
| 0.4.7 | Aug 15, 2025 |
| 0.4.6 | Aug 14, 2025 |
| 0.4.5 | Aug 14, 2025 |
| 0.4.4 | Aug 13, 2025 |
| 0.4.3 | May 28, 2025 |
| 0.4.2 | May 20, 2025 |
| 0.4.1 | May 14, 2025 |
| 0.4.0 | May 14, 2025 |
| 0.3.6 | May 12, 2025 |
| 0.3.5 | Apr 9, 2025 |
| 0.3.4 | Mar 16, 2025 |
| 0.3.3 | Mar 2, 2025 |
| 0.3.2 | Feb 5, 2025 |
| 0.3.1 | Jan 6, 2025 |
| 0.3.0 | Jan 4, 2025 |
| 0.2.1 | Dec 19, 2024 |
| 0.2.0 | Dec 16, 2024 |
| 0.1.2 | Dec 7, 2024 |
| 0.1.1 | Dec 5, 2024 |
| 0.1.0 | Nov 30, 2024 |
Download files

| File | Type | Size | Uploaded |
|---|---|---|---|
| `mlx_omni_server-0.5.2.tar.gz` | Source distribution | 55.3 kB | Dec 21, 2025 |
| `mlx_omni_server-0.5.2-py3-none-any.whl` | Built distribution (Python 3) | 78.2 kB | Dec 21, 2025 |

File details

Details for the file `mlx_omni_server-0.5.2.tar.gz`.

| Field | Value |
|---|---|
| Upload date | Dec 21, 2025 |
| Size | 55.3 kB |
| Tags | Source |
| Uploaded using Trusted Publishing? | No |
| Uploaded via | uv/0.9.7 |

| Algorithm | Hash digest |
|---|---|
| SHA256 | `7e621b469bfa8fb5c0fc7756bd29305d5bdb9173bcf747fef81a963909ebf7ba` |
| MD5 | `56e128217cf3c2e7500232b19678b520` |
| BLAKE2b-256 | `437aa172434c047a9383ae2ba655be4802d42d4796cc8435d4e768ae2add0890` |

Details for the file `mlx_omni_server-0.5.2-py3-none-any.whl`.

| Field | Value |
|---|---|
| Upload date | Dec 21, 2025 |
| Size | 78.2 kB |
| Tags | Python 3 |
| Uploaded using Trusted Publishing? | No |
| Uploaded via | uv/0.9.7 |

| Algorithm | Hash digest |
|---|---|
| SHA256 | `f32fa660edc77bca90fdb2c42313fdbaa17618c78baef968cad868e8e23eb992` |
| MD5 | `a59c22111c5b72e97c02591a0a572978` |
| BLAKE2b-256 | `b07dd7d1eedd1417b3f6860d821fa9c6c1a1208dd209ea8da65d945a357109ad` |
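
The digests listed for each file can be checked locally after a manual download. A minimal sketch using only `hashlib`; the function names are hypothetical, and the path you pass in is wherever you saved the file.

```python
import hashlib
from pathlib import Path
from typing import Union


def sha256_of(path: Union[str, Path], chunk_size: int = 1 << 20) -> str:
    """Compute the SHA256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_digest(path: Union[str, Path], expected_hex: str) -> bool:
    """Compare a file's SHA256 digest with a published hex digest."""
    return sha256_of(path) == expected_hex.strip().lower()
```

For example, `matches_digest("mlx_omni_server-0.5.2.tar.gz", "7e621b46...")` with the full SHA256 value listed for the source distribution should return `True` for an intact download.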
