JANG — Adaptive Mixed-Precision Quantization for Apple Silicon. The GGUF equivalent for MLX.
Run JANG models with MLX Studio — the easiest way to run LLMs on Apple Silicon
Early Adoption: JANG is a new quantization format. LM Studio, Ollama, oMLX, and other MLX inference apps do not support JANG yet. Use MLX Studio (native JANG support) or the `jang` Python package for inference. Ask your favorite app's creators to add JANG support!
Jang Adaptive N-bit Grading
Mixed-Precision Quantization for Apple Silicon
The GGUF equivalent for MLX — models stay quantized in GPU memory at full Metal speed.
What is JANG?
JANG redistributes quantization bits based on tensor sensitivity. Critical layers (attention) get more bits, bulk layers (MLP) compensate — same total size, smarter allocation.
Like GGUF K-quants for MLX.
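The core idea above — more bits for sensitive layers, fewer for bulk layers, same total budget — can be sketched as a toy allocator. This is illustrative only: the layer names, the sensitivity scores, and the proportional-share heuristic are assumptions for the example, not JANG's actual internal algorithm.

```python
def allocate_bits(layers, budget_bits):
    """Toy budget-neutral mixed-precision allocator.

    layers: list of (name, n_params, sensitivity) tuples,
            where sensitivity is a relative importance score (assumption).
    budget_bits: total bits available across all layers.
    Returns a {name: bits_per_weight} plan.
    """
    total_weighted = sum(n * s for _, n, s in layers)
    plan = {}
    for name, n, s in layers:
        # Each layer's share of the budget is proportional to its
        # sensitivity-weighted size, clamped to 2..8 bits per weight.
        share = budget_bits * (n * s) / total_weighted
        plan[name] = max(2, min(8, round(share / n)))
    return plan

# Hypothetical two-layer model: attention is sensitive, MLP is bulk.
layers = [
    ("attn.q_proj", 4_000_000, 3.0),   # high sensitivity
    ("mlp.up_proj", 16_000_000, 1.0),  # low sensitivity
]
budget = 4 * sum(n for _, n, _ in layers)  # 4 bits/weight on average
print(allocate_bits(layers, budget))
# Attention ends up with more bits than MLP at the same average budget.
```

The clamping step means the result is only approximately budget-neutral in general; the point is the redistribution, not the exact arithmetic.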
Results
2-bit: JANG beats MLX 2-bit on every model tested, roughly doubling MMLU
| Model | JANG_2S (MMLU) | MLX 2-bit (MMLU) | Size (JANG vs MLX) |
|---|---|---|---|
| Qwen3.5-122B MoE | 84% | 56% | 38 GB vs 36 GB |
| Qwen3.5-35B MoE | 62% | ~20% | 12 GB vs 10 GB |
| Qwen3.5-9B | 36% | 18% | 3.5 GB vs 2.6 GB |
| Qwen3.5-4B | 28% | 14% | 1.6 GB vs 1.3 GB |
4-bit: JANG_4K — smaller than MLX, higher MMLU
| Model | JANG_4K (MMLU) | MLX 4-bit (MMLU) | Size (JANG vs MLX) |
|---|---|---|---|
| Qwen3.5-35B MoE | 84% | 82% | 16.7 GB vs 18 GB |
Install
```shell
pip install jang
```
For inference on Apple Silicon:
```shell
pip install "jang[mlx]"
```
Quick Start
Convert any model
```shell
# K-quant 4-bit (budget-neutral, same size as MLX, smarter)
jang convert Qwen/Qwen3.5-35B-A3B -p 4

# 2-bit for extreme compression
jang convert Qwen/Qwen3.5-122B-A10B -p 2

# Specific profile
jang convert model -p JANG_2S
```
Run inference
```python
from jang_tools.loader import load_jang_model
from mlx_lm.sample_utils import make_sampler
from mlx_lm.generate import generate_step
import mlx.core as mx

model, tokenizer = load_jang_model("JANGQ-AI/Qwen3.5-122B-A10B-JANG_2S")
sampler = make_sampler(temp=0.7)
tokens = tokenizer.encode("What is photosynthesis?")

for tok, _ in generate_step(prompt=mx.array(tokens), model=model, max_tokens=200, sampler=sampler):
    t = tok.item() if hasattr(tok, "item") else int(tok)
    print(tokenizer.decode([t]), end="", flush=True)
    if t == tokenizer.eos_token_id:
        break
```
Profiles
| Profile | Type | Bits | Best for |
|---|---|---|---|
| JANG_4K | K-quant | 4.0 | Same size as MLX 4-bit, smarter |
| JANG_3K | K-quant | 3.0 | Same size as MLX 3-bit, smarter |
| JANG_2S | Profile | ~2.1 | Tightest 2-bit, near MLX 2-bit size |
| JANG_2M | Profile | ~2.1 | Balanced 2-bit |
| JANG_2L | Profile | ~2.3 | Quality 2-bit |
| JANG_1L | Profile | ~2.2 | Maximum quality 2-bit |
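To pick a profile for your hardware, a rough disk/memory estimate is just parameters times bits per weight. The 5% overhead factor below is an assumption to cover scales, embeddings, and metadata, so this is a ballpark, not the exact sizes listed in the tables (which keep some tensors at higher precision).

```python
def estimated_size_gb(n_params, bits_per_weight, overhead=1.05):
    """Ballpark quantized-model size in GB.

    n_params: total parameter count.
    bits_per_weight: average bits from the profile table (e.g. ~2.1 for JANG_2S).
    overhead: fudge factor for scales/metadata (assumption, not from JANG docs).
    """
    return n_params * bits_per_weight / 8 / 1e9 * overhead

# Qwen3.5-122B at the JANG_2S profile (~2.1 bits/weight): roughly 33-34 GB,
# in the same range as the 38 GB listed above once higher-precision tensors
# are accounted for.
print(f"{estimated_size_gb(122e9, 2.1):.1f} GB")
```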
Pre-quantized Models
Available on HuggingFace:
| Model | Profile | MMLU | Size |
|---|---|---|---|
| Qwen3.5-122B-A10B | JANG_2S | 84% | 38 GB |
| Qwen3.5-35B-A3B | JANG_4K | 84% | 16.7 GB |
| Qwen3.5-35B-A3B | JANG_2S | 62% | 12 GB |
Supported Architectures
Dense Transformer, Mixture of Experts, Hybrid SSM, Linear Attention, MLA, Vision-Language, Mamba, FP8 source models.
Links
Created by Jinho Jang — jangq.ai