7 projects
Defuser
Model defuser helper for HF Transformers.
GPTQModel
Production-ready LLM compression/quantization toolkit with hardware-accelerated inference support for both CPU and GPU via HF, vLLM, and SGLang.
Device-SMI
Retrieve GPU, CPU, and NPU device info and properties on Linux/macOS with zero package dependencies.
Evalution
Modern LLM evaluation for Transformers, SGLang, vLLM, TensorRT-LLM, llama.cpp, GPTQModel, OpenAI-compatible HTTP backends, and OpenVINO.
LogBar
A unified Logger and ProgressBar util with zero dependencies.
TokeNicer
A (nicer) tokenizer you want to use for model `inference` and `training`, with all known preventable `gotchas` normalized or auto-fixed.
PyPcre
Modern, GIL-friendly, fast Python bindings for PCRE2 with automatic caching and JIT compilation of compiled patterns.