High-performance OpenAI-compatible embedding server for Roo Code indexing on Apple Silicon

These details have been verified by PyPI

Project links

GitHub Statistics

Maintainers

dmarkey

These details have not been verified by PyPI

Project description

roocode-code-indexer-macos

High-performance OpenAI-compatible embedding server for Roo Code codebase indexing on Apple Silicon.

Runs Qwen3 embedding models locally via MLX with optimized batched GPU inference — no API keys needed. Up to 5x faster than Ollama for the same models.

Quick Start

# From PyPI
uvx roocode-code-indexer-macos

# Or from GitHub (latest)
uvx --from git+https://github.com/dmarkey/roocode-code-indexer-macos roocode-code-indexer-macos

Or with the recommended 4B model:

uvx roocode-code-indexer-macos --model mlx-community/Qwen3-Embedding-4B-4bit-DWQ

Roo Code Configuration

In Roo Code settings, set the embedding provider to OpenAI-compatible and point it at:

http://localhost:8000/v1

Roo Code codebase indexing configuration

Example using mlx-community/Qwen3-Embedding-8B-4bit-DWQ (4096 dimensions). mlx-community/Qwen3-Embedding-4B-4bit-DWQ (2560 dimensions) is recommended for most use cases.

Available Models

Model	Embedding Dim	Size	Description
`mlx-community/Qwen3-Embedding-0.6B-4bit-DWQ`	1024	~900MB	Fast, good for small codebases (default)
`mlx-community/Qwen3-Embedding-4B-4bit-DWQ`	2560	~2.5GB	Recommended — best speed/quality tradeoff
`mlx-community/Qwen3-Embedding-8B-4bit-DWQ`	4096	~4.5GB	Highest quality, slower

Models are downloaded automatically from HuggingFace on first use.

Benchmarks: MLX vs Ollama

Synthetic code-like text data (10 concurrent requests x 60 texts = 600 texts), simulating Roo Code's indexing burst pattern. Reproducible via python benchmark.py.

Cold cache (first indexing run)

Server	Wall time	texts/sec	vs Ollama
Ollama qwen3-embedding:0.6b	9.0s	67	baseline
Ollama qwen3-embedding:4b	50.9s	12	0.2x
Ollama qwen3-embedding:8b	errors	-	crashed under load
MLX Qwen3-Embedding-0.6B	1.8s	337	5x faster
MLX Qwen3-Embedding-4B	9.0s	67	5.7x faster
MLX Qwen3-Embedding-8B	15.2s	39	-

Warm cache (re-indexing with overlapping files)

Server	Wall time	texts/sec
MLX Qwen3-Embedding-0.6B	42ms	14,414
MLX Qwen3-Embedding-4B	95ms	6,340
MLX Qwen3-Embedding-8B	146ms	4,106

Ollama has no embedding cache — every re-index pays full inference cost.

Why is MLX faster?

Request coalescing — concurrent requests are merged into single GPU batches instead of queuing serially
Length-bucketed sub-batching — minimizes padding waste across variable-length texts
Compiled Metal kernels — mx.compile with fixed padding buckets for graph reuse
LRU embedding cache — repeated texts skip inference entirely
Last-token pooling — matches Qwen3-Embedding training objective

Environment Variables

Variable	Default	Description
`MODEL_NAME`	`mlx-community/Qwen3-Embedding-0.6B-4bit-DWQ`	Default model
`PORT`	`8000`	Server port
`HOST`	`0.0.0.0`	Bind address
`MAX_BATCH_SIZE`	`1024`	Max texts per batch
`MAX_TEXT_LENGTH`	`8192`	Max tokens per text
`LOG_LEVEL`	`INFO`	Logging level
`MAX_LOADED_MODELS`	`2`	Max models in memory (reduce to 1 on 8GB machines)

Requirements

macOS with Apple Silicon (M1/M2/M3/M4)
Python 3.10+

License

MIT

Project details

These details have been verified by PyPI

Project links

GitHub Statistics

Maintainers

dmarkey

These details have not been verified by PyPI

Release history Release notifications | RSS feed

This version

0.1.2

Mar 24, 2026

0.1.1

Mar 24, 2026

0.1.0

Mar 24, 2026

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

roocode_code_indexer_macos-0.1.2.tar.gz (124.7 kB view details)

Uploaded Mar 24, 2026 Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

The dropdown lists show the available interpreters, ABIs, and platforms. Enable javascript to be able to filter the list of wheel files.

roocode_code_indexer_macos-0.1.2-py3-none-any.whl (14.6 kB view details)

Uploaded Mar 24, 2026 Python 3

File details

Details for the file roocode_code_indexer_macos-0.1.2.tar.gz.

File metadata

Download URL: roocode_code_indexer_macos-0.1.2.tar.gz
Upload date: Mar 24, 2026
Size: 124.7 kB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for roocode_code_indexer_macos-0.1.2.tar.gz
Algorithm	Hash digest
SHA256	`152eb08cdb1fe483daa8ad110837dffd996c983367b934b68b8dc17a0025a3dc`
MD5	`a30997f99a97fb08d0d85cef8de0cc68`
BLAKE2b-256	`4c1ddd936a0a122f451441195d62fbf523f84b6a6159200f9020d6a3c4431c83`

See more details on using hashes here.

Provenance

The following attestation bundles were made for roocode_code_indexer_macos-0.1.2.tar.gz:

Publisher: publish.yml on dmarkey/roocode-code-indexer-macos

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: roocode_code_indexer_macos-0.1.2.tar.gz
- Subject digest: 152eb08cdb1fe483daa8ad110837dffd996c983367b934b68b8dc17a0025a3dc
- Sigstore transparency entry: 1176583603
- Sigstore integration time: Mar 24, 2026
Source repository:
- Permalink: dmarkey/roocode-code-indexer-macos@e114b3b436c734acae855d7f862755025d808e93
- Branch / Tag: refs/tags/v0.1.2
- Owner: https://github.com/dmarkey
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@e114b3b436c734acae855d7f862755025d808e93
- Trigger Event: release

File details

Details for the file roocode_code_indexer_macos-0.1.2-py3-none-any.whl.

File metadata

Download URL: roocode_code_indexer_macos-0.1.2-py3-none-any.whl
Upload date: Mar 24, 2026
Size: 14.6 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for roocode_code_indexer_macos-0.1.2-py3-none-any.whl
Algorithm	Hash digest
SHA256	`9ec3e6c820eeae6d097591668abc5c83bb18b61009c28325bc5312a64320bc5f`
MD5	`3a2dc37d11c40d8b73a5307b28c144c0`
BLAKE2b-256	`d3b3c524c9e9cc759e88cf7870ff9950e72a37ce52988072a780247a49464a28`

See more details on using hashes here.

Provenance

The following attestation bundles were made for roocode_code_indexer_macos-0.1.2-py3-none-any.whl:

Publisher: publish.yml on dmarkey/roocode-code-indexer-macos

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: roocode_code_indexer_macos-0.1.2-py3-none-any.whl
- Subject digest: 9ec3e6c820eeae6d097591668abc5c83bb18b61009c28325bc5312a64320bc5f
- Sigstore transparency entry: 1176583771
- Sigstore integration time: Mar 24, 2026
Source repository:
- Permalink: dmarkey/roocode-code-indexer-macos@e114b3b436c734acae855d7f862755025d808e93
- Branch / Tag: refs/tags/v0.1.2
- Owner: https://github.com/dmarkey
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@e114b3b436c734acae855d7f862755025d808e93
- Trigger Event: release

roocode-code-indexer-macos 0.1.2

Navigation

Verified details

Project links

GitHub Statistics

Maintainers

Unverified details

Meta

Classifiers

Project description

roocode-code-indexer-macos

Quick Start

Roo Code Configuration

Available Models

Benchmarks: MLX vs Ollama

Cold cache (first indexing run)

Warm cache (re-indexing with overlapping files)

Why is MLX faster?

Environment Variables

Requirements

License

Project details

Verified details

Project links

GitHub Statistics

Maintainers

Unverified details

Meta

Classifiers

Release history Release notifications | RSS feed

Download files

Source Distribution

Built Distribution

File details

File metadata

File hashes

Provenance

File details

File metadata

File hashes

Provenance