Unofficial implementation of Momentum Low-Rank Compression (MLorc) for memory-efficient LLM fine-tuning

Project description

MLorc - Momentum Low-Rank Compression for Memory-Efficient LLM Fine-tuning

Unofficial implementation of "MLorc: Momentum Low-rank Compression for Large Language Model Adaptation"

This repository introduces MLorc (Momentum Low-rank Compression), a highly memory-efficient paradigm that significantly reduces the memory footprint of full-parameter fine-tuning for large language models. Based on the paper "MLorc: Momentum Low-rank Compression for Large Language Model Adaptation", this method offers a compelling alternative to existing memory-efficient techniques.


Install

pip install MLorc-optim


How MLorc Works

MLorc's core innovation lies in its approach to momentum compression and reconstruction:

  • Direct Momentum Compression: MLorc directly compresses and reconstructs both the first- and second-order momentum using Randomized SVD (RSVD) at each optimization step, so only the low-rank factors need to be stored between steps.
  • Adaptive Second-Order Momentum Handling: Because the second-order momentum must stay non-negative, MLorc applies ReLU during reconstruction and adaptively adds a small constant to the exact zeros ReLU introduces, which keeps the update stable (see the sketch below).
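
The per-step flow can be sketched in a few lines of PyTorch. The snippet below is an illustrative sketch of the idea only, not the package's internal code: it assumes torch.svd_lowrank as the RSVD routine, and the shapes and hyperparameters (a 256x256 weight, rank=4, Adam-style betas) are made up for the example.

    import torch

    def rsvd_compress(mat, rank):
        # Randomized SVD: mat ≈ U diag(S) V^T, keeping only `rank` components.
        U, S, V = torch.svd_lowrank(mat, q=rank, niter=2)
        return U, S, V

    def rsvd_reconstruct(U, S, V, nonneg=False, tiny=1e-8):
        mat = U @ torch.diag(S) @ V.T
        if nonneg:
            # Second-order momentum must stay non-negative: clip with ReLU,
            # then add a small constant to the exact zeros ReLU introduces.
            mat = torch.relu(mat)
            mat = torch.where(mat == 0, torch.full_like(mat, tiny), mat)
        return mat

    rank, beta1, beta2, lr = 4, 0.9, 0.999, 1e-4
    W = torch.randn(256, 256)                  # weight matrix being fine-tuned
    grad = torch.randn_like(W)                 # its gradient at the first step

    # First step: form the momenta densely, then keep only the low-rank factors.
    m_factors = rsvd_compress((1 - beta1) * grad, rank)
    v_factors = rsvd_compress((1 - beta2) * grad.pow(2), rank)

    # Every later step: reconstruct -> update -> apply -> re-compress.
    grad = torch.randn_like(W)
    m = beta1 * rsvd_reconstruct(*m_factors) + (1 - beta1) * grad
    v = beta2 * rsvd_reconstruct(*v_factors, nonneg=True) + (1 - beta2) * grad.pow(2)
    W = W - lr * m / (v.sqrt() + 1e-8)
    m_factors, v_factors = rsvd_compress(m, rank), rsvd_compress(v, rank)

Only the small factors of each momentum matrix persist between steps, which is where the memory saving over storing full optimizer states comes from.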

Key Advantages of MLorc

MLorc is broadly applicable to any momentum-based optimizer (e.g., Adam, Lion) and delivers superior performance:

  • State-of-the-Art Performance: Empirically, MLorc consistently outperforms other memory-efficient methods like LoRA and GaLore in terms of validation accuracy. It can even match or exceed the performance of full fine-tuning with a small rank (e.g., rank=4).
  • Memory and Time Efficiency: It maintains comparable memory efficiency to LoRA while demonstrating improved time efficiency compared to GaLore.
  • Theoretical Guarantees: MLorc offers a theoretical guarantee for convergence, matching the convergence rate of the original Lion optimizer under reasonable assumptions.

Included MLorc-Integrated Optimizers

This repository integrates MLorc into six momentum-based optimizers, each with additional enhancements for improved performance and stability:

  1. MLorc_AdamW: AdamW with MLorc compression, featuring:

    • Fused Backward Pass
    • Gradient Descent with Adaptive Momentum Scaling (Grams): For better performance and faster convergence.
    • atan2 smoothing & scaling: A robust replacement for eps (no tuning required), which also incorporates gradient clipping. (If enabled, eps is ignored.)
    • OrthoGrad: Prevents "naïve loss minimization" (NLM), which can lead to overfitting, by removing the gradient component parallel to the weight, thus improving generalization (see the sketch after this list).
  2. MLorc_Prodigy: Prodigy with MLorc compression.

  3. MLorc_Lion: Lion with MLorc compression.

  4. MLorc_DAdapt_Lion:

    • Same features as MLorc_Lion
    • Integrates MLorc with the D-Adaptation adaptive method for Lion, and includes the slice_p feature (from Prodigy).
  5. MLorc_Adopt: ADOPT with MLorc compression.

  6. MLorc_CAME: CAME with MLorc compression.
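
Two of the less familiar enhancements listed under MLorc_AdamW (OrthoGrad and the atan2 replacement for eps) can be sketched as standalone functions. This is a hedged illustration of the general techniques, not the package's actual code; the function names, the tensor flattening, and the 1.27 scale factor are assumptions made for the example.

    import torch

    def orthograd(weight, grad, eps=1e-30):
        # OrthoGrad: drop the gradient component parallel to the weight,
        # then rescale so the orthogonal remainder keeps the original norm.
        w, g = weight.flatten(), grad.flatten()
        proj = torch.dot(w, g) / (torch.dot(w, w) + eps)
        g_orth = g - proj * w
        g_orth = g_orth * (g.norm() / (g_orth.norm() + eps))
        return g_orth.view_as(grad)

    def atan2_update(exp_avg, exp_avg_sq, a=1.27, b=1.0):
        # Bounded replacement for exp_avg / (exp_avg_sq.sqrt() + eps):
        # atan2 needs no eps and implicitly clips extreme ratios.
        return a * torch.atan2(exp_avg, b * exp_avg_sq.sqrt())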

Download files

Download the file for your platform.

Source Distribution

mlorc_optim-0.1.7.tar.gz (20.2 kB)

Uploaded Source

Built Distribution


mlorc_optim-0.1.7-py3-none-any.whl (30.8 kB)

Uploaded Python 3

File details

Details for the file mlorc_optim-0.1.7.tar.gz.

File metadata

  • Download URL: mlorc_optim-0.1.7.tar.gz
  • Upload date:
  • Size: 20.2 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.10.11

File hashes

Hashes for mlorc_optim-0.1.7.tar.gz
  • SHA256: 02fa39b0510e7b9be70f3fd5aa9bbff29b03f1417fdfbabd3ffed3c4d236db0f
  • MD5: 53ebffee45d99daedca1f323c5d78513
  • BLAKE2b-256: 3b3542a875835f26216182503d0efd38ffc391404efcc0de4e8489d2fb5d9b55


File details

Details for the file mlorc_optim-0.1.7-py3-none-any.whl.

File metadata

  • Download URL: mlorc_optim-0.1.7-py3-none-any.whl
  • Upload date:
  • Size: 30.8 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.10.11

File hashes

Hashes for mlorc_optim-0.1.7-py3-none-any.whl
  • SHA256: 2470dbb8f93baf63b8501d17f5dc4d4dfdd1bcdc1d85056efe339a75554ff465
  • MD5: dbc77250be39ead118a3a61d5baa9dd7
  • BLAKE2b-256: 4370efac0b3fbf1de8805bc12d3143d9b0c37e7100a2451c2f2c7f8b03c6197c

