
JANG — Adaptive Mixed-Precision Quantization for Apple Silicon. The GGUF equivalent for MLX.

Project description

MLX Studio — the only app that natively supports JANG models


Early Adoption: LM Studio, Ollama, oMLX, Inferencer do not support JANG yet. Use MLX Studio or pip install "jang[mlx]". Ask your favorite app's creators to add JANG support!


JANG

Jang Adaptive N-bit Grading

Mixed-Precision Quantization for Apple Silicon

The GGUF equivalent for MLX — models stay quantized in GPU memory at full Metal speed.


Website · Models · PyPI · Format Spec

Results (200-question MMLU)

MoE at 4-bit: JANG_4K beats MLX

Model          JANG_4K   MLX 4-bit   JANG Size   MLX Size
Qwen3.5-122B   86%       85%         69 GB       64 GB
Qwen3.5-35B    77.5%     75.5%       16.7 GB     18 GB

MoE at 2-bit: JANG dominates

Model          JANG_2S   MLX 2-bit   JANG Size   MLX Size
Qwen3.5-122B   79%       56.5%       38 GB       36 GB
Qwen3.5-35B    65.5%     ~20%        12 GB       10 GB

MiniMax: JANG is the ONLY working option

Model          JANG_2L   MLX 4-bit   MLX 3-bit   MLX 2-bit
MiniMax-M2.5   74%       26.5%       24.5%       25%

MLX is broken on MiniMax at ALL bit levels: ~25% is random-chance accuracy on four-option MMLU. JANG scores 74%.

Dense/Hybrid at 2-bit: JANG saves what MLX destroys

Model        JANG_2S   MLX 2-bit   JANG Size   MLX Size
Qwen3.5-4B   28.5%     12.5%       1.5 GB      1.2 GB
Qwen3.5-9B   25.5%     22.0%       3.4 GB      2.7 GB

At 3-bit and 4-bit, MLX uniform is better on dense models — JANG's value is at 2-bit (where uniform fails) and on MoE (where attention is < 5% of params).

Install

pip install jang

For inference on Apple Silicon:

pip install "jang[mlx]"

For Vision-Language models:

pip install "jang[vlm]"

Quick Start

Convert any model

# K-quant 4-bit (same size as MLX, smarter allocation)
jang convert Qwen/Qwen3.5-35B-A3B -p 4

# 2-bit for extreme compression
jang convert Qwen/Qwen3.5-122B-A10B -p 2

# Specific profile
jang convert model -p JANG_2S

Run inference

from jang_tools.loader import load_jang_model
from mlx_lm.sample_utils import make_sampler
from mlx_lm.generate import generate_step
import mlx.core as mx

# Load a pre-quantized JANG model from the Hugging Face Hub
model, tokenizer = load_jang_model("JANGQ-AI/Qwen3.5-122B-A10B-JANG_2S")
sampler = make_sampler(temp=0.7)

# Stream tokens as they are generated
tokens = tokenizer.encode("What is photosynthesis?")
for tok, _ in generate_step(prompt=mx.array(tokens), model=model, max_tokens=200, sampler=sampler):
    t = tok.item() if hasattr(tok, 'item') else int(tok)  # mx.array or int, depending on mlx-lm version
    print(tokenizer.decode([t]), end="", flush=True)
    if t == tokenizer.eos_token_id:
        break

Upgrade v1 models to v2 (instant loading)

jang upgrade /path/to/model

CLI Commands

Command                             Description
jang convert <model> -p <profile>   Convert a Hugging Face model to JANG
jang upgrade <model>                Upgrade a v1 model to v2 (instant load)
jang inspect <model>                Show bit allocation and model info
jang validate <model>               Validate a JANG model directory
jang estimate <params>              Estimate sizes (e.g., jang estimate 122B)
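
The arithmetic behind jang estimate can be approximated by hand. A back-of-envelope sketch (illustrative only; the real command accounts for mixed allocation, per-group scales, and metadata, which is why published sizes run higher than the raw product):

# Rough size: parameter count x effective bits per weight / 8.
def estimate_gb(params_billions: float, bits_per_weight: float) -> float:
    return params_billions * 1e9 * bits_per_weight / 8 / 1e9

print(f"{estimate_gb(122, 2.1):.0f} GB")  # ~32 GB raw; published JANG_2S is 38 GB
print(f"{estimate_gb(122, 4.0):.0f} GB")  # ~61 GB raw; published JANG_4K is 69 GB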

v2 Format — Instant Loading

JANG v2 stores weights in MLX-native format. Like GGUF — the file IS the runtime format. No conversion at load time.

            v2 (current)     v1 (legacy)
Load time   Seconds (mmap)   5-10 minutes (repack)
File size   Same             Same

New conversions automatically use v2. Existing v1 models can be upgraded with jang upgrade.
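
The "no repack" claim is easy to picture, since v2 shards are MLX-loadable safetensors. A minimal sketch (the shard filename is hypothetical; load_jang_model handles sharding, config, and the tokenizer for you):

import mlx.core as mx

# mx.load reads a safetensors file directly into lazily evaluated MLX
# arrays, so a v2 shard becomes usable in seconds -- no repacking step.
weights = mx.load("model-00001-of-00004.safetensors")  # hypothetical shard name
name, array = next(iter(weights.items()))
print(name, array.dtype, array.shape)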

Profiles

Profile   Type      Bits   Best for
JANG_4K   K-quant   4.0    Same size as MLX 4-bit, smarter
JANG_3K   K-quant   3.0    Same size as MLX 3-bit, smarter
JANG_2S   Profile   ~2.1   Tightest 2-bit, near MLX 2-bit size
JANG_2L   Profile   ~2.3   Quality 2-bit
JANG_1L   Profile   ~2.2   Maximum quality 2-bit

Pre-quantized Models

Model               Profile   MMLU (200q)   Size      Best for
Qwen3.5-122B-A10B   JANG_4K   86%           69 GB     192+ GB Mac
Qwen3.5-122B-A10B   JANG_2S   79%           38 GB     64+ GB Mac
Qwen3.5-35B-A3B     JANG_4K   77.5%         16.7 GB   36+ GB Mac
Qwen3.5-35B-A3B     JANG_2S   65.5%         12 GB     24+ GB Mac
MiniMax-M2.5        JANG_2L   74%           89 GB     192+ GB Mac
Qwen3.5-9B          JANG_2S   25.5%         3.4 GB    8 GB MacBook
Qwen3.5-4B          JANG_2S   28.5%         1.5 GB    8 GB MacBook

Supported Architectures

Dense Transformer, Mixture of Experts, Hybrid SSM, Linear Attention (GatedDeltaNet), MLA (DeepSeek), Vision-Language, Mamba, FP8 source models (MiniMax, DeepSeek).

How It Works

JANG redistributes bits based on tensor sensitivity — same total size, smarter allocation:

CRITICAL  (attention, MoE routers)   →  6-8 bit  →  Controls coherence
IMPORTANT (embeddings, linear attn)  →  4-6 bit  →  Moderate sensitivity
COMPRESS  (MLP, MoE experts)         →  2-4 bit  →  98% of parameters

K-quant profiles (JANG_4K, JANG_3K) redistribute within the same bit budget — boost attention, compensate with least-important MLP. Same size as MLX, smarter allocation. Like GGUF K-quants.
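
A minimal sketch of the idea, using hypothetical tensor-name patterns (this is not JANG's actual classifier, which covers many more architectures and balances the total budget exactly):

# Hypothetical name-based allocator -- illustrative, not JANG's real code.
CRITICAL = ("self_attn", "router")           # controls coherence -> 6-8 bit
IMPORTANT = ("embed_tokens", "linear_attn")  # moderate sensitivity -> 4-6 bit

def assign_bits(tensor_name: str, base_bits: int = 4) -> int:
    if any(key in tensor_name for key in CRITICAL):
        return min(base_bits + 2, 8)   # boost the critical few
    if any(key in tensor_name for key in IMPORTANT):
        return base_bits
    return max(base_bits - 2, 2)       # squeeze MLP/experts (98% of params)

print(assign_bits("layers.0.self_attn.q_proj"))        # 6
print(assign_bits("layers.0.mlp.experts.12.up_proj"))  # 2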

Requirements

  • Python: 3.11+
  • Conversion: any platform (numpy + safetensors)
  • Inference: Apple Silicon Mac (M1/M2/M3/M4) with MLX (see the quick check after this list)
  • Dependencies: safetensors>=0.4, numpy>=1.24, tqdm>=4.60, huggingface_hub>=0.20
  • Optional: mlx>=0.22, mlx-lm>=0.20 (for inference), mlx-vlm>=0.1 (for VLM)
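
A quick sanity check before attempting inference, assuming the mlx extra is installed:

import mlx.core as mx

# Conversion runs on any platform; inference needs Metal (Apple Silicon).
assert mx.metal.is_available(), "JANG inference requires an Apple Silicon Mac"
print(mx.default_device())  # Device(gpu, 0) on M-series hardware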

Links

GitHub · HuggingFace · MLX Studio · PyPI


Korean

What is JANG?

JANG is an open-source mixed-precision quantization format for Apple Silicon. It plays the same role for MLX that GGUF plays elsewhere.

Results (200-question MMLU)

4-bit: JANG_4K beats MLX 4-bit (MoE models)

Model          JANG_4K   MLX 4-bit   Size
Qwen3.5-122B   86%       85%         69 vs 64 GB
Qwen3.5-35B    77.5%     75.5%       16.7 vs 18 GB

2-bit: JANG dominates MLX

Model          JANG_2S   MLX 2-bit   Size
Qwen3.5-122B   79%       56.5%       38 vs 36 GB
Qwen3.5-35B    65.5%     ~20%        12 vs 10 GB

MiniMax: only JANG works

Model          JANG_2L   MLX 4-bit   MLX 3-bit   MLX 2-bit
MiniMax-M2.5   74%       26.5%       24.5%       25%

Install

pip install "jang[mlx]"

Compatibility

Currently only MLX Studio natively supports the JANG format. LM Studio, Ollama, oMLX, Inferencer, and others do not support it yet. Ask your favorite app's developers to add JANG support!

GitHub · HuggingFace · MLX Studio · PyPI

Created by Jinho Jang — jangq.ai

Support on Ko-fi



Download files

Download the file for your platform.

Source Distribution

jang-2.1.1.tar.gz (62.1 kB)


Built Distribution


jang-2.1.1-py3-none-any.whl (66.1 kB)


File details

Details for the file jang-2.1.1.tar.gz.

File metadata

  • Download URL: jang-2.1.1.tar.gz
  • Size: 62.1 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.14.2

File hashes

Hashes for jang-2.1.1.tar.gz

Algorithm     Hash digest
SHA256        17bad46e22fbc4cab0ed78ceaeb61ec36dc7cf93a4372d0a9f9214abc89dc8dc
MD5           6370ba4dfcd87551a99d3bb3652332a6
BLAKE2b-256   c417e482816870dc4d0073f95be6ce704e6406e474c862f92dfea77dd583ac91


File details

Details for the file jang-2.1.1-py3-none-any.whl.

File metadata

  • Download URL: jang-2.1.1-py3-none-any.whl
  • Size: 66.1 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.14.2

File hashes

Hashes for jang-2.1.1-py3-none-any.whl

Algorithm     Hash digest
SHA256        895c72b84661ebab93f4bac4278fa32c00207290c525b5e9a28a6f9a2234ced2
MD5           9682c40b7792049670ff9bc6335db899
BLAKE2b-256   6d4a37cb0c1861adf7a6311e8b66f13a7f7c1f800835c4dd7c02b856edf79e69

