6 projects
miniforge
High-performance MiniMax M2.7 inference library optimized for GMKtech M7
fastvq
TurboQuant: Extreme compression for AI models with near-optimal distortion rates
petallm
PetaLLM lets a single 4GB GPU run 70B large language models without quantization, distillation, or pruning; 8GB of VRAM runs 405B Llama 3.1.
kimi-k2-optimizer
Inference optimization suite for Kimi K2.5 (1.1T) on the RTX 3090, with aggressive RAM usage reduction
zeroquant
Zero-config model quantization for notebooks and Python
ommi-llm
Run 70B+ LLMs on consumer GPUs with layer-wise inference