
JANG — Adaptive Mixed-Precision Quantization for Apple Silicon. The GGUF equivalent for MLX.


MLX Studio — the only app that natively supports JANG models


Compatibility Notice

JANG is a new quantization format. The following apps do NOT support it yet:

  • LM Studio
  • Ollama
  • oMLX
  • Inferencer

MLX Studio is currently the only app with native JANG support. You can also use the jang Python package directly (pip install "jang[mlx]").

Want JANG support in your favorite app? Ask the developers to add it! JANG is open-source (GitHub) and the format spec is public (FORMAT.md).


JANG

Jang Adaptive N-bit Grading

Mixed-Precision Quantization for Apple Silicon

The GGUF equivalent for MLX — models stay quantized in GPU memory at full Metal speed.

What is JANG?

JANG redistributes quantization bits based on tensor sensitivity. Critical layers (attention) get more bits, bulk layers (MLP) compensate — same total size, smarter allocation.

Like GGUF K-quants for MLX.
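The allocation idea can be illustrated with a toy sketch (illustrative only — the function names and the greedy scheme below are assumptions, not JANG internals): start every tensor at the low bit width, then upgrade the most sensitive tensors while the average bit budget allows it.

```python
# Toy sketch of sensitivity-based bit allocation (illustrative only;
# names and the greedy scheme are assumptions, not JANG internals).

def allocate_bits(sensitivities, budget_bits=2.0, low=2, high=4):
    """Assign per-tensor bit widths so the average stays at `budget_bits`.

    Start everyone at `low` bits, then upgrade the most sensitive
    tensors to `high` bits while the average stays within budget.
    """
    n = len(sensitivities)
    bits = {name: low for name in sensitivities}
    # Most sensitive first (e.g. attention projections).
    ranked = sorted(sensitivities, key=sensitivities.get, reverse=True)
    total = low * n
    for name in ranked:
        if (total - low + high) / n <= budget_bits:
            bits[name] = high
            total += high - low
    return bits

sens = {
    "attn.q_proj": 0.9,   # attention: high sensitivity
    "attn.k_proj": 0.8,
    "mlp.gate":    0.2,   # MLP bulk: low sensitivity
    "mlp.up":      0.1,
}
plan = allocate_bits(sens, budget_bits=2.5)
```

With a 2.5-bit budget over these four tensors, only the most sensitive one is upgraded to 4 bits; the rest stay at 2, and the average lands exactly on budget.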

Results

2-bit: JANG_2S beats MLX 2-bit on every model tested, often doubling MMLU

| Model | JANG_2S | MLX 2-bit | Size (JANG vs MLX) |
| --- | --- | --- | --- |
| Qwen3.5-122B MoE | 84% MMLU | 56% | 38 GB vs 36 GB |
| Qwen3.5-35B MoE | 62% MMLU | ~20% | 12 GB vs 10 GB |
| Qwen3.5-9B | 36% MMLU | 18% | 3.5 GB vs 2.6 GB |
| Qwen3.5-4B | 28% MMLU | 14% | 1.6 GB vs 1.3 GB |

4-bit: JANG_4K — smaller than MLX, higher MMLU

| Model | JANG_4K | MLX 4-bit | Size (JANG vs MLX) |
| --- | --- | --- | --- |
| Qwen3.5-35B MoE | 84% MMLU | 82% | 16.7 GB vs 18 GB |

Install

pip install jang

For inference on Apple Silicon:

pip install "jang[mlx]"

Quick Start

Convert any model

# K-quant 4-bit (budget-neutral, same size as MLX, smarter)
jang convert Qwen/Qwen3.5-35B-A3B -p 4

# 2-bit for extreme compression
jang convert Qwen/Qwen3.5-122B-A10B -p 2

# Specific profile
jang convert model -p JANG_2S
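To convert several models in one go, you can shell out to the CLI from Python. The command shape (`jang convert <model> -p <profile>`) comes from the examples above; the helper itself is just an illustrative wrapper, not part of the jang package.

```python
# Batch-convert several models by shelling out to the `jang` CLI.
# `convert_cmd`/`batch_convert` are illustrative helpers, not jang APIs.
import subprocess

def convert_cmd(model: str, profile: str) -> list[str]:
    """Build the argv for `jang convert <model> -p <profile>`."""
    return ["jang", "convert", model, "-p", profile]

def batch_convert(jobs: dict[str, str], dry_run: bool = True) -> None:
    for model, profile in jobs.items():
        cmd = convert_cmd(model, profile)
        if dry_run:
            print(" ".join(cmd))   # preview the command
        else:
            subprocess.run(cmd, check=True)

batch_convert({
    "Qwen/Qwen3.5-35B-A3B": "JANG_4K",
    "Qwen/Qwen3.5-122B-A10B": "JANG_2S",
})
```

Set `dry_run=False` to actually run the conversions (each can take a while and downloads the source model first).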

Run inference

from jang_tools.loader import load_jang_model
from mlx_lm.sample_utils import make_sampler
from mlx_lm.generate import generate_step
import mlx.core as mx

model, tokenizer = load_jang_model("JANGQ-AI/Qwen3.5-122B-A10B-JANG_2S")
sampler = make_sampler(temp=0.7)

tokens = tokenizer.encode("What is photosynthesis?")
for tok, _ in generate_step(prompt=mx.array(tokens), model=model, max_tokens=200, sampler=sampler):
    t = tok.item() if hasattr(tok, 'item') else int(tok)
    print(tokenizer.decode([t]), end="", flush=True)
    if t == tokenizer.eos_token_id:
        break

Profiles

| Profile | Type | Bits | Best for |
| --- | --- | --- | --- |
| JANG_4K | K-quant | 4.0 | Same size as MLX 4-bit, smarter |
| JANG_3K | K-quant | 3.0 | Same size as MLX 3-bit, smarter |
| JANG_2S | Profile | ~2.1 | Tightest 2-bit, near MLX 2-bit size |
| JANG_2M | Profile | ~2.1 | Balanced 2-bit |
| JANG_2L | Profile | ~2.3 | Quality 2-bit |
| JANG_1L | Profile | ~2.2 | Maximum quality 2-bit |
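A back-of-envelope way to pick a profile for your memory budget is params × bits / 8, using the effective bit widths from the table above. `estimate_gb` is an illustrative helper, not part of the jang package, and it ignores embeddings and per-group scale metadata, so real files (like the sizes listed below) come out somewhat larger.

```python
# Rough quantized-weight size: params * bits / 8.
# PROFILE_BITS comes from the profile table; `estimate_gb` is an
# illustrative helper (a lower bound, ignoring metadata overhead).

PROFILE_BITS = {
    "JANG_4K": 4.0,
    "JANG_3K": 3.0,
    "JANG_2S": 2.1,
    "JANG_2M": 2.1,
    "JANG_2L": 2.3,
    "JANG_1L": 2.2,
}

def estimate_gb(n_params: float, profile: str) -> float:
    """Approximate quantized weight size in GB."""
    bits = PROFILE_BITS[profile]
    return n_params * bits / 8 / 1e9

# e.g. a 35B-parameter model at JANG_2S:
print(round(estimate_gb(35e9, "JANG_2S"), 1))  # ~9.2 GB of raw weights
```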

Pre-quantized Models

Available on HuggingFace:

| Model | Profile | MMLU | Size |
| --- | --- | --- | --- |
| Qwen3.5-122B-A10B | JANG_2S | 84% | 38 GB |
| Qwen3.5-35B-A3B | JANG_4K | 84% | 16.7 GB |
| Qwen3.5-35B-A3B | JANG_2S | 62% | 12 GB |

Supported Architectures

Dense Transformer, Mixture of Experts, Hybrid SSM, Linear Attention, MLA, Vision-Language, Mamba, FP8 source models.

Created by Jinho Jang — jangq.ai
