GEAR
Project description
GEAR: An Efficient KV Cache Compression Recipe for Near-Lossless Generative Inference of LLM
Download files
Source Distribution
GearLM-0.0.2.tar.gz (1.2 kB)
Built Distributions
GearLM-0.0.2-py3-none-any.whl (1.2 kB)
GEARLM-0.0.2-py3-none-any.whl (1.2 kB)