RedHatAI Admin

Username: neuralmagic

14 projects

compressed-tensors

Library for working with compressed safetensors of neural network models

guidellm

Guidance platform for deploying and managing large language models.

llmcompressor

A library for compressing large language models using the latest techniques and research in the field, covering both training-aware and post-training techniques. The library is designed to be flexible and easy to use on top of PyTorch and HuggingFace Transformers, allowing for quick experimentation.

speculators

A unified library for creating, representing, and storing speculative decoding algorithms for LLM serving such as in vLLM.

deepsparse-ent

[DEPRECATED] An inference runtime offering GPU-class performance on CPUs and APIs to integrate ML into your application

deepsparse

[DEPRECATED] An inference runtime offering GPU-class performance on CPUs and APIs to integrate ML into your application

sparseml

[DEPRECATED] Libraries for applying sparsification recipes to neural networks with a few lines of code, enabling faster and smaller models

sparsezoo

[DEPRECATED] Neural network model repository for highly sparse and sparse-quantized models with matching sparsification recipes

sparsify

[DEPRECATED] Easy-to-use UI for automatically sparsifying neural networks and creating sparsification recipes for better inference performance and a smaller footprint

llmcompressor-nightly

A library for compressing large language models using the latest techniques and research in the field, covering both training-aware and post-training techniques. The library is designed to be flexible and easy to use on top of PyTorch and HuggingFace Transformers, allowing for quick experimentation.

compressed-tensors-nightly

Library for working with compressed safetensors of neural network models

nm-vllm

A high-throughput and memory-efficient inference and serving engine for LLMs

nm-magic-wand-nightly

SparseLinear layers

nm-magic-wand

SparseLinear layers
