AirLLM allows a single 4GB GPU card to run 70B large language models without quantization, distillation, or pruning.
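As a quick illustration, here is a minimal inference sketch assuming the `airllm` package's `AutoModel` interface and a CUDA-capable GPU; the model id and argument names are examples and may differ between airllm versions.

```python
# Minimal AirLLM inference sketch (assumes the airllm AutoModel interface;
# class and argument names may vary across versions).
from airllm import AutoModel

MAX_LENGTH = 128

# Layers are streamed from disk one at a time during inference,
# so a 70B model can fit within a few GB of GPU memory.
model = AutoModel.from_pretrained("garage-bAInd/Platypus2-70B-instruct")

input_text = ["What is the capital of the United States?"]

input_tokens = model.tokenizer(
    input_text,
    return_tensors="pt",
    truncation=True,
    max_length=MAX_LENGTH,
)

generation_output = model.generate(
    input_tokens["input_ids"].cuda(),
    max_new_tokens=20,
    use_cache=True,
    return_dict_in_generate=True,
)

print(model.tokenizer.decode(generation_output.sequences[0]))
```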