mhc-mlx

High-performance MLX implementation of Manifold-Constrained Hyper-Connections (mHC) for Apple Silicon.

This library provides a drop-in MHCLayer that fuses multiple operations into optimized Metal kernels. In the benchmarks below it is faster than compiled MLX reference layers (1.41x to 2.27x) and than the mlx-mhc package (2.86x to 8.25x).

Original Paper: mHC: Manifold-Constrained Hyper-Connections (DeepSeek-AI)

Installation

Install from PyPI:

pip install mhc-mlx

Quick Start

Option 1: Drop-in Layer (Recommended)

Use MHCLayer for maximum performance.

import mlx.core as mx
from mhc_mlx import MHCLayer

layer = MHCLayer(n=32, C=64) # 32 streams, 64 channels each
x = mx.random.normal((1, 32, 64))
y = layer(x)

Option 2: Universal Wrapper (MHCRewire)

Wrap any existing MLX module (Linear, Conv2d, Transformers) to add manifold-constrained stability. Note: wrapping an arbitrary module incurs some overhead compared to the fused MHCLayer.

import mlx.nn as nn
from mhc_mlx import MHCRewire

# Wrap a standard Linear layer
layer = MHCRewire(nn.Linear(512, 512), dims=512, n=16)

Performance

We benchmarked on an Apple M4 Pro (macOS 15.6). mhc-mlx was faster than both baselines in every configuration listed below.
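
The measurement procedure is not described here, so the following is only a minimal sketch of how latency figures of this kind could be collected, assuming a warm-up phase followed by median-of-N wall-clock timing. The helper name `median_latency_us` is hypothetical and not part of mhc-mlx. Because MLX evaluates lazily, the timed callable should force evaluation, for example `lambda: mx.eval(layer(x))`.

```python
import statistics
import time


def median_latency_us(fn, warmup=10, iters=100):
    """Return the median wall-clock time of fn() in microseconds.

    Hypothetical helper for illustration; not part of mhc-mlx.
    """
    # Warm-up calls absorb one-time costs such as kernel compilation.
    for _ in range(warmup):
        fn()

    samples = []
    for _ in range(iters):
        start = time.perf_counter()
        fn()
        samples.append((time.perf_counter() - start) * 1e6)

    # The median is less sensitive to scheduling noise than the mean.
    return statistics.median(samples)


if __name__ == "__main__":
    # Stand-in workload; swap in `lambda: mx.eval(layer(x))` for an MLX layer.
    print(f"{median_latency_us(lambda: sum(range(1000))):.1f} us")
```

Timing the same callable for both implementations on the same inputs keeps the comparison like for like.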

Head-to-Head: mhc-mlx vs mlx-mhc (Competitor)

| Scenario | mhc-mlx (ours) | mlx-mhc (them) | Speedup |
|---|---|---|---|
| Latency ($B=1, C=512$) | 392 µs | 1120 µs | 2.86x |
| Throughput ($B=32, C=512$) | 105 µs | 866 µs | 8.25x |
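
The speedup column is the baseline time divided by the mhc-mlx time. The short check below reproduces both figures from the reported timings; the function name `speedup` is just an illustrative helper.

```python
def speedup(baseline_us, ours_us):
    """Ratio of baseline time to our time; values above 1 mean we are faster."""
    return baseline_us / ours_us


# (ours, theirs) timings in microseconds, taken from the table above.
rows = {
    "Latency (B=1, C=512)": (392, 1120),
    "Throughput (B=32, C=512)": (105, 866),
}

for name, (ours, theirs) in rows.items():
    print(f"{name}: {speedup(theirs, ours):.2f}x")
# Latency (B=1, C=512): 2.86x
# Throughput (B=32, C=512): 8.25x
```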

Why We're Faster

| Implementation | Characteristics | Performance Impact |
|---|---|---|
| Python / JIT | Many small kernel launches | Higher overhead, low occupancy |
| Fused Metal | 1-3 highly optimized kernels | Minimal overhead, maximum bandwidth |

Latency Floor ($B=1$, Sequence Length=32)

| Channels (C) | Kernel Strategy | Layer Speedup (vs Compiled MLX) |
|---|---|---|
| 256 | Fully Fused | 2.27x |
| 1024 | Fully Fused | 1.57x |
| 2048 | Fully Fused | 1.58x |
| 4096 | Column Parallel | 1.41x |
| 8192 | Column Parallel | 2.18x |
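
In this table the strategy switches from Fully Fused to Column Parallel somewhere between 2048 and 4096 channels. A rule consistent with these five rows can be sketched as below. This is an illustration only: the cutoff `FUSED_MAX_CHANNELS` and the function `pick_strategy` are assumptions, and the library's actual heuristic may also depend on batch size, stream count, or dtype.

```python
# Assumed cutoff: the largest channel count listed as "Fully Fused" above.
FUSED_MAX_CHANNELS = 2048


def pick_strategy(channels: int) -> str:
    """Hypothetical dispatch rule matching the table rows, not the real one."""
    if channels <= FUSED_MAX_CHANNELS:
        return "fully_fused"
    return "column_parallel"


for c in (256, 1024, 2048, 4096, 8192):
    print(c, pick_strategy(c))
```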

Key Optimizations

  • Fully Fused Kernel: Single kernel for Aggregate + RMS + Mix + Add.
  • Column-Parallel Mixing: Vectorized kernel maximizing throughput for larger workloads.
  • Adaptive Dispatch: Runtime heuristic selects the fastest kernel strategy.
  • Super-Fused Backward: Fused gradients for maximum training efficiency.
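
To make the four fused steps concrete, here is an unfused pure-Python reference for a single token. It is one plausible reading of the step names, not the library's kernel: the parameter names `h_pre`, `h_post`, and `M`, the absence of a learned RMS scale, and the exact form of each step are all assumptions. Each step below makes its own pass over the data, which is the traffic a single fused kernel avoids.

```python
import math


def mhc_reference(x, h_pre, h_post, M, eps=1e-6):
    """Unfused reference for one token (illustrative assumptions throughout).

    x:      n streams, each a list of C floats
    h_pre:  n weights used to aggregate the streams
    h_post: n weights used to distribute the result back to the streams
    M:      n x n matrix that mixes the streams
    """
    n, C = len(x), len(x[0])

    # 1. Aggregate: weighted sum of the n streams into one C-vector.
    y = [sum(h_pre[i] * x[i][c] for i in range(n)) for c in range(C)]

    # 2. RMS: divide by the root-mean-square of the aggregated vector.
    rms = math.sqrt(sum(v * v for v in y) / C + eps)
    y = [v / rms for v in y]

    # 3. Mix: each output stream is a weighted combination of input streams.
    mixed = [
        [sum(M[i][j] * x[j][c] for j in range(n)) for c in range(C)]
        for i in range(n)
    ]

    # 4. Add: distribute the normalized vector back onto the mixed streams.
    return [
        [mixed[i][c] + h_post[i] * y[c] for c in range(C)]
        for i in range(n)
    ]
```

With an identity `M` and all-zero `h_post`, the function returns the input streams unchanged, which is a convenient sanity check.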

Troubleshooting

Run diagnostics to check your environment:

mhc-mlx-info
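
If the command is unavailable, a few basic facts can be gathered by hand. The sketch below uses only the Python standard library and is not a reimplementation of `mhc-mlx-info`, whose output is not documented here; the function name `environment_report` is hypothetical.

```python
import importlib.util
import platform
import sys


def environment_report():
    """Collect basic environment facts relevant to running MLX code."""
    return {
        "python": platform.python_version(),
        "os": sys.platform,
        "machine": platform.machine(),
        # Apple Silicon Macs report "darwin" and "arm64".
        "apple_silicon": sys.platform == "darwin"
        and platform.machine() == "arm64",
        # find_spec returns None when a top-level package is not installed.
        "mlx_installed": importlib.util.find_spec("mlx") is not None,
        "mhc_mlx_installed": importlib.util.find_spec("mhc_mlx") is not None,
    }


if __name__ == "__main__":
    for key, value in environment_report().items():
        print(f"{key}: {value}")
```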

License

MIT
