Profile of ModelCloud

Last released Jul 2, 2026

Model defuser helper for HF Transformers.

Last released Jun 8, 2026

Production-ready LLM model compression/quantization toolkit with hardware-accelerated inference support for both CPU and GPU via HF, vLLM, and SGLang.

Last released Apr 23, 2026

Retrieve GPU, CPU, and NPU device info and properties from Linux/macOS with zero package dependencies.

Last released Apr 19, 2026

Modern LLM model evaluation for Transformers, SGLang, vLLM, TensorRT-LLM, llama.cpp, GPTQModel, OpenAI-compatible HTTP backends, and OpenVINO.

Last released Apr 15, 2026

A unified Logger and ProgressBar utility with zero dependencies.

Last released Apr 15, 2026

A (nicer) tokenizer you want to use for model `inference` and `training`, with all known preventable `gotchas` normalized or auto-fixed.

Last released Apr 15, 2026

Modern, GIL-friendly, fast Python bindings for PCRE2 with automatic caching and JIT of compiled patterns.
