Tiny AutoEncoders for diffusion on Apple Silicon — live previews + low-memory decode for FLUX & SD.

mlx-taef

Tiny AutoEncoders for diffusion latents on Apple Silicon, in pure MLX.

mlx-taef is the first MLX port of the TAESD family — TAESD (SD1.x), TAESDXL (SDXL), TAEF1 (FLUX.1), TAEF2 (FLUX.2 Klein) — distilled mini-autoencoders that decode diffusion latents to RGB in milliseconds using a few-MB model instead of multi-GB full VAEs.

Use it for:

  • Live previews during long generations on Mac: TAEF1 decodes a 512×512 preview in ~183 ms and TAEF2 in ~258 ms on M1 Max (vs ~2 s for the full VAE). See COMPARISON.md for the measured table and reproducer.
  • Low-memory fallbacks when the full VAE OOMs on 16 GB Macs (TAEF2 peaks at ~0.6 GB decode memory vs ~2.6 GB for the full FLUX.2 VAE on the same latent).
  • Quick latent inspection in notebooks and ML research.

import mlx.core as mx
from mlx_taef import TAEF2

# `latents` is a latent array produced by your diffusion pipeline
# (32 channels for TAEF2; see the Variants table).
taef = TAEF2.from_pretrained()              # downloads + converts on first call
img = taef.decode(latents)                  # NHWC float in [0, 1]
img_uint8 = taef.decode_image(latents)      # uint8 NHWC ready for PIL

Which library do I need?

You want live previews or low-memory FLUX decode? You're in the right place. mlx-taef decodes diffusion latents to RGB in ~260 ms (TAEF2) or ~185 ms (TAEF1) on M1 Max — vs ~2 seconds for the full VAE, with ~4× less peak memory. Drops into mflux via LivePreviewCallback.
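
As a quick sanity check, the ratios implied by the figures quoted above can be worked out directly. This is only arithmetic on the rounded numbers from this README (the full-VAE time is given only as roughly 2 s), so the results are approximate; the constant and function names are illustrative and not part of mlx-taef.

```python
# Figures quoted in this README (M1 Max, 512x512 preview); all are approximate.
FULL_VAE_MS = 2000.0   # "~2 seconds" for the full VAE
TAEF1_MS = 183.0       # TAEF1 decode time
TAEF2_MS = 258.0       # TAEF2 decode time
FULL_VAE_GB = 2.6      # full FLUX.2 VAE peak decode memory
TAEF2_GB = 0.6         # TAEF2 peak decode memory


def ratio(baseline: float, value: float) -> float:
    """Return baseline / value rounded to one decimal place."""
    return round(baseline / value, 1)


speedup_taef1 = ratio(FULL_VAE_MS, TAEF1_MS)
speedup_taef2 = ratio(FULL_VAE_MS, TAEF2_MS)
memory_saving = ratio(FULL_VAE_GB, TAEF2_GB)

print(f"TAEF1 speedup: ~{speedup_taef1}x")
print(f"TAEF2 speedup: ~{speedup_taef2}x")
print(f"TAEF2 peak-memory reduction: ~{memory_saving}x")
```

The memory ratio is where the "~4× less peak memory" figure comes from; COMPARISON.md remains the authoritative source for the measured values.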

You want FLUX generation itself to be faster on Apple Silicon? You want mlx-teacache — it skips redundant denoising steps when the schedule is cacheable (measured 1.46× on FLUX.1-dev at 25 steps).

You want both: faster generation AND live previews? Use them together — they compose cleanly. mflux 4-step Klein + TeaCache + TAEF2 previews = 1.30× wall-clock and 26% less peak memory vs vanilla.

Install

From PyPI:

pip install mlx-taef
# With the mflux preview callback:
pip install "mlx-taef[mflux]"

Or with uv:

uv add mlx-taef
# With mflux:
uv add "mlx-taef[mflux]"

Pin an exact version in a project that needs reproducibility:

pip install "mlx-taef==0.2.0"

Verify the install:

mlx-taef --help

Requires Python ≥ 3.11 and Apple Silicon (mlx itself is Apple-Silicon-only). Runtime install has zero PyTorch dependency: torch is dev-only and used solely for fixture generation in the test suite.
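
If you prefer to check these requirements from Python rather than the CLI, a minimal sketch using only the standard library might look like the following. The helper names are hypothetical and not part of mlx-taef; the platform test assumes a native arm64 interpreter (under Rosetta, `platform.machine()` reports `x86_64`).

```python
import platform
import sys
from importlib import metadata
from typing import Optional


def meets_requirements() -> bool:
    """True on Python >= 3.11 running natively on Apple Silicon macOS."""
    python_ok = sys.version_info >= (3, 11)
    apple_silicon = platform.system() == "Darwin" and platform.machine() == "arm64"
    return python_ok and apple_silicon


def installed_version(dist_name: str = "mlx-taef") -> Optional[str]:
    """Return the installed distribution version, or None if it is not installed."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


print("requirements met:", meets_requirements())
print("mlx-taef version:", installed_version())
```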

Variants

Variant    latent_channels    For                       HF source
TAESD      4                  Stable Diffusion 1.x      madebyollin/taesd
TAESDXL    4                  Stable Diffusion XL       madebyollin/taesdxl
TAEF1      16                 FLUX.1                    madebyollin/taef1
TAEF2      32                 FLUX.2 Klein              madebyollin/taef2

All four share one API.
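
When wiring a variant into your own pipeline, it can help to validate the latent's channel count before decoding. The sketch below copies the table into a plain dict and checks a shape against it. It is an illustration, not the library's code: the lowercase keys other than "taef2", the helper name, and the channels-last latent layout are all assumptions.

```python
# Values copied from the variants table; the dict and helper are illustrative only.
VARIANTS = {
    "taesd":   {"latent_channels": 4,  "hf_source": "madebyollin/taesd"},
    "taesdxl": {"latent_channels": 4,  "hf_source": "madebyollin/taesdxl"},
    "taef1":   {"latent_channels": 16, "hf_source": "madebyollin/taef1"},
    "taef2":   {"latent_channels": 32, "hf_source": "madebyollin/taef2"},
}


def check_latent_channels(variant: str, latent_shape: tuple) -> None:
    """Raise ValueError if the last axis does not match the variant's channels.

    Assumes a channels-last layout such as (N, H, W, C).
    """
    if variant not in VARIANTS:
        raise ValueError(f"unknown variant {variant!r}; expected one of {sorted(VARIANTS)}")
    expected = VARIANTS[variant]["latent_channels"]
    actual = latent_shape[-1]
    if actual != expected:
        raise ValueError(
            f"{variant} expects {expected} latent channels, got {actual} "
            f"(shape {latent_shape})"
        )


check_latent_channels("taef2", (1, 32, 32, 32))  # passes silently
```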

Benchmarks

Side-by-side images + measured timings: see COMPARISON.md.

All numbers there come from scripts/run_showcase.py (subprocess-per-rep bench harness) and the committed _artifacts/showcase_report.json. Per-rep raw arrays are preserved so reviewers can see variance, not just summary stats.
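
To illustrate the subprocess-per-rep idea, here is a minimal sketch that times a statement in a fresh Python process for every repetition and keeps the raw per-rep timings alongside a summary. It is not the code in scripts/run_showcase.py; the function name, report keys, and the use of a plain Python statement as the workload are assumptions made for illustration.

```python
import json
import statistics
import subprocess
import sys

# Program run in each child process: time one execution of the statement
# passed as the first argument and print the elapsed seconds as JSON.
_CHILD = """
import json, sys, time
stmt = sys.argv[1]
start = time.perf_counter()
exec(stmt)
print(json.dumps(time.perf_counter() - start))
"""


def bench_subprocess_per_rep(stmt: str, reps: int = 3) -> dict:
    """Run `stmt` once per fresh interpreter and collect every timing."""
    raw = []
    for _ in range(reps):
        out = subprocess.run(
            [sys.executable, "-c", _CHILD, stmt],
            capture_output=True,
            text=True,
            check=True,
        )
        # The child's last stdout line is the JSON-encoded elapsed time.
        raw.append(json.loads(out.stdout.strip().splitlines()[-1]))
    return {
        "reps": reps,
        "raw_seconds": raw,  # preserved so variance stays visible
        "median_seconds": statistics.median(raw),
    }


report = bench_subprocess_per_rep("sum(range(100_000))", reps=3)
print(json.dumps(report, indent=2))
```

Starting a new process per repetition means no rep benefits from caches or allocations left behind by an earlier one, at the cost of paying interpreter startup each time.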

The previous v0.1.x README claim — "~100 ms decode at 1024×1024, 50–100× faster than the full Flux VAE; ~1 GB peak vs ~9.6 GB" — was a same-process measurement under v0.1's tests/test_perf.py. v0.2.0 re-measures under subprocess-per-rep with per-condition memory caps; see COMPARISON.md for the honest replacement numbers.

mflux live previews

from mflux.models.flux2 import Flux2Klein
from mlx_taef.integrations.mflux import LivePreviewCallback

model = Flux2Klein.from_pretrained("4bit")
preview = LivePreviewCallback(
    flux=model,            # auto-extracts the Flux2VAE BN stats for exact color
    variant="taef2",
    every=5,
    save_to="preview.png",
    latent_height=32,      # 512 / 16
    latent_width=32,
)
model.callbacks.register(preview)
model.generate_image(
    prompt="a red apple on a wooden table",
    num_inference_steps=25,
    width=512,
    height=512,
    seed=42,
)

Passing flux=model lets the callback auto-extract model.vae.bn.running_mean and running_var so TAEF2 previews are color-correct out of the box (callback.resolved_bn == "auto"). If you have a custom integration where flux= isn't convenient, pass bn_mean= and bn_var= explicitly — those take precedence (resolved_bn == "explicit"). Without either path you get identity-BN previews with correct structure but shifted colors (resolved_bn == "none").
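
The precedence described above can be sketched as a small function. This is a hypothetical re-implementation for illustration, not the callback's actual source: the function name is invented, and requiring both bn_mean and bn_var for the explicit path is an assumption.

```python
from types import SimpleNamespace
from typing import Any, Tuple


def resolve_bn(flux: Any = None, bn_mean: Any = None, bn_var: Any = None) -> Tuple[str, Any, Any]:
    """Return (mode, mean, var) following the documented precedence."""
    # Explicit stats take precedence over anything extracted from the model.
    if bn_mean is not None and bn_var is not None:
        return "explicit", bn_mean, bn_var
    # Otherwise read the running statistics from the model's VAE.
    if flux is not None:
        bn = flux.vae.bn
        return "auto", bn.running_mean, bn.running_var
    # Neither path available: identity BN, correct structure but shifted colors.
    return "none", None, None


# Stand-in object exposing the attribute path model.vae.bn.running_mean / running_var.
fake_model = SimpleNamespace(
    vae=SimpleNamespace(bn=SimpleNamespace(running_mean=[0.1], running_var=[0.9]))
)

print(resolve_bn(flux=fake_model)[0])
print(resolve_bn(flux=fake_model, bn_mean=[0.0], bn_var=[1.0])[0])
print(resolve_bn()[0])
```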

See docs/manual-verification.md for the full verification recipe.
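
The `latent_height=32` and `latent_width=32` values in the example follow from the `512 / 16` comment. A small helper for other resolutions could look like this; the helper is hypothetical, and both the default factor of 16 (taken from that comment for the TAEF2 path) and the requirement that dimensions divide evenly are assumptions.

```python
from typing import Tuple


def latent_size(width: int, height: int, factor: int = 16) -> Tuple[int, int]:
    """Return (latent_height, latent_width) for an image of the given size."""
    if width % factor or height % factor:
        raise ValueError(
            f"width and height must be multiples of {factor}, got {width}x{height}"
        )
    return height // factor, width // factor


latent_height, latent_width = latent_size(width=512, height=512)
print(latent_height, latent_width)
```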

Status

  • v0.1.0 — initial public release on PyPI (2026-05-13). All four variants, encoder + decoder, mflux integration, CI, 99 % honest coverage.
  • v0.2.0 — released on PyPI (2026-05-27). Auto-bn extraction in LivePreviewCallback(flux=...); per-step gallery mode (numbered_frames=True); subprocess-per-rep showcase bench (scripts/run_showcase.py); hardware-aware memory caps via mlx_taef._memory_caps; COMPARISON.md + committed JSON report; ROADMAP.md.
  • v0.2.3 — released on PyPI (2026-05-29). Weight loading is now strict: from_pretrained_local raises on an incomplete or wrong-shaped weights file instead of loading a silently-wrong model, and the HF→MLX converter checks parameter coverage and shapes at convert time (new ConversionError). The end-to-end parity tests now gate on an absolute pixel tolerance rather than cosine similarity. A bare pytest skips the network and benchmark tests by default (--run-network / --run-benchmark to opt in).
  • v0.3.0 — released on PyPI (2026-06-06). Internal kernel refactor: each variant is now a composable ModelKernel (mlx_taef.kernels), so adding a model is a self-contained entry; variants.py stays a back-compat shim. Ships one user-facing fix — the mflux LivePreviewCallback FLUX.1 path fed the packed latent straight to the decoder and produced wrong previews; it now unpacks correctly.
  • v0.3.1 — released on PyPI (2026-06-08). Hardening: decode()/encode() raise a clear error when weights haven't been loaded yet, or when a latent has the wrong channel count, instead of returning garbage; importing without mflux installed now raises MfluxNotInstalledError (a TaefError that is also an ImportError).
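
The v0.3.1 note about MfluxNotInstalledError being both a TaefError and an ImportError describes ordinary multiple inheritance. The sketch below re-declares two classes with those names purely to show the effect; it is not the library's source, and the `require_module` helper is hypothetical.

```python
import importlib


class TaefError(Exception):
    """Illustrative stand-in for the package's base error."""


class MfluxNotInstalledError(TaefError, ImportError):
    """Caught by handlers for either TaefError or ImportError."""


def require_module(name: str):
    """Import `name`, converting a failed import into MfluxNotInstalledError."""
    try:
        return importlib.import_module(name)
    except ImportError as exc:
        raise MfluxNotInstalledError(f"{name} is not installed") from exc


# Existing code that guards optional imports with `except ImportError` keeps working.
try:
    require_module("module_that_does_not_exist_xyz")
except ImportError as exc:
    print(type(exc).__name__, "caught as ImportError")
```

Deriving from both bases lets callers handle every package error with one `except TaefError` while leaving conventional `except ImportError` guards intact.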

Track future releases via the PyPI history or gh release list -R IonDen/mlx-taef.

License

MIT. Mirrors upstream madebyollin/taesd license. Pretrained weights belong to their respective authors (madebyollin).

Acknowledgements


By Denis Ineshin · ineshin.space
