AccelHydra is a lightweight, modular, configurable training framework built on Accelerate and Hydra

AccelHydra

AccelHydra is still under development and may contain errors. Bug reports and PRs are welcome!

:sparkles: Introduction

:thinking: What is AccelHydra?

A lightweight, configurable and modular training framework based on Accelerate and Hydra.

It IS:

  • a trainer wrapping PyTorch, providing some basic utility functions to improve the reusability of PyTorch training code.
  • built on accelerate to support various distributed training / inference environments.
  • built on hydra to support modular training configurations and command line overrides, with potential extended features like parameter sweeping.

It IS NOT:

  • a training framework designed for specific tasks (e.g., LLMs, image/audio-related tasks...).
  • a package including various state-of-the-art model implementations.
  • an inference-time accelerating or memory-reducing toolkit.

:bulb: Why might you want to use AccelHydra?

  • Avoid rewriting boilerplate for every project. The training loop and some utility functions stay almost the same across projects, so we factor them out into a base library.
  • Config loading is managed by Hydra and distributed training is managed by Accelerate, so you don't need to worry about these details.
  • Maintain a moderate level of abstraction. Great libraries like PyTorch Lightning and Transformers are powerful, but their codebases are too deep for newcomers to understand, or they lack a convenient interface for modification. We want neither a Trainer with dozens of inheritance layers nor a single train.py holding all the logic in thousands of lines.
  • Similar codebases exist, such as lightning-hydra-template and lightning-accelerate. However, task-specific code should be kept separate from base code (base classes and utility functions) so that bugs in the base code can be fixed continuously. The principle here is therefore to integrate only generic code into the library, not code designed for specific tasks (CV, NLP, RL, ...).

:warning: Why might you not want to use AccelHydra?

  • Overriding can sometimes become complicated with Hydra. Breaking training configs into separate components makes things clearer, but this approach may also fail from time to time.
  • Efficiency considerations. This library is suited to academic research, where the goal is to implement an idea and test its applicability; it does not address the efficiency of data loading, training, or inference.
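
As a rough illustration of how overrides can fail: Hydra rejects a plain key=value override when the key is missing from the composed config, and expects +key=value to add a new key. The sketch below is a standard-library toy that mimics only this rule. It is not Hydra's implementation; apply_override and the config keys are hypothetical names chosen for the example, and values are kept as plain strings rather than being type-parsed.

```python
from copy import deepcopy


def apply_override(cfg: dict, override: str) -> dict:
    """Apply one 'a.b.c=value' override to a nested dict and return a copy.

    Simplified rule (assumption, modeled loosely on Hydra): a plain override
    must target an existing key, while a leading '+' allows adding a new one.
    """
    add_new = override.startswith("+")
    body = override[1:] if add_new else override
    key_path, _, raw_value = body.partition("=")
    keys = key_path.split(".")

    new_cfg = deepcopy(cfg)  # never mutate the caller's config
    node = new_cfg
    for key in keys[:-1]:
        if key not in node:
            if not add_new:
                raise KeyError(f"Could not override '{key_path}': '{key}' not found")
            node[key] = {}
        node = node[key]

    leaf = keys[-1]
    if leaf not in node and not add_new:
        raise KeyError(
            f"Could not override '{key_path}'; use '+{key_path}=...' to add it"
        )
    node[leaf] = raw_value  # values stay strings in this toy version
    return new_cfg


# Hypothetical config, standing in for what split YAML files might compose to.
base = {"optimizer": {"lr": "1e-3"}, "trainer": {"epochs": "10"}}
tuned = apply_override(base, "optimizer.lr=5e-4")           # existing key: fine
extended = apply_override(base, "+optimizer.momentum=0.9")  # new key needs '+'
```

When a config is split across many files, it is easy to override a key that a given component does not define, which is one way a command line that worked for one setup can fail for another.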

:package: Installation

This repository is tested on Python 3.10+. We recommend creating a new virtual environment before installing AccelHydra to avoid breaking existing environments:

pip install accel_hydra
# or
pip install git+https://github.com/wsntxxn/AccelHydra

:computer: Usage

Check out the documentation or have a look at examples.

Basically, to use AccelHydra for training, you need to implement your own datasets, models, loss functions, and trainer:

  • You don't need to re-implement the basic functions and classes that AccelHydra already provides.
  • The trainer should inherit from accel_hydra.Trainer and implement the required training_step function (and validation_step if validation is used).
  • Write Hydra-style YAML configs with the top-level train.yaml.
  • (Optional) Write a custom TrainLauncher if you need to customize the training setup.
  • (Optional) Write a custom training entry script train.py.
  • Launch training using the built-in entry point (for example, 8 GPUs in total across 2 nodes, fp16):
accelerate launch \
  --num_processes 8 \
  --num_machines 2 \
  --mixed_precision fp16 \
  -m accel_hydra.train_entry \
  -c configs/train.yaml
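
The division of labor described above (the base class owns the loop, your subclass supplies training_step and optionally validation_step) can be sketched in plain Python. This README does not show the real signature of accel_hydra.Trainer, so the example uses a stand-in base class, SketchTrainer, written only for illustration. Everything except the method names training_step and validation_step is an assumption, and the toy task uses neither PyTorch nor Accelerate.

```python
from abc import ABC, abstractmethod


class SketchTrainer(ABC):
    """Stand-in base class for illustration; NOT the real accel_hydra.Trainer."""

    def __init__(self, train_data, val_data=None, epochs=1):
        self.train_data = train_data
        self.val_data = val_data
        self.epochs = epochs

    @abstractmethod
    def training_step(self, batch) -> float:
        """Update the model on one batch and return the loss."""

    def validation_step(self, batch) -> float:
        raise NotImplementedError("Implement validation_step to use validation")

    def fit(self) -> list[dict]:
        """Generic loop: the part a base library can own once for all projects."""
        history = []
        for epoch in range(self.epochs):
            losses = [self.training_step(b) for b in self.train_data]
            record = {"epoch": epoch, "train_loss": sum(losses) / len(losses)}
            if self.val_data is not None:
                val = [self.validation_step(b) for b in self.val_data]
                record["val_loss"] = sum(val) / len(val)
            history.append(record)
        return history


class MeanFitTrainer(SketchTrainer):
    """Toy task: fit a scalar w to the data by gradient descent on squared error."""

    def __init__(self, *args, lr=0.1, **kwargs):
        super().__init__(*args, **kwargs)
        self.w = 0.0
        self.lr = lr

    def training_step(self, batch):
        n = len(batch)
        loss = sum((self.w - x) ** 2 for x in batch) / n
        grad = sum(2 * (self.w - x) for x in batch) / n
        self.w -= self.lr * grad  # one plain gradient-descent update
        return loss

    def validation_step(self, batch):
        return sum((self.w - x) ** 2 for x in batch) / len(batch)


trainer = MeanFitTrainer([[1.0, 3.0], [2.0, 2.0]], val_data=[[2.0]], epochs=50)
history = trainer.fit()
print(history[-1])
```

Only the two step methods are task-specific here; the loop in fit stays the same for any subclass, which is the kind of reuse the library aims for.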

:memo: Note

Currently, AccelHydra has only been tested on GPU nodes. You are welcome to test it on other hardware and open corresponding PRs.

:book: Citation

If you find this repository useful, please consider citing:

@misc{xu2025accelhydra,
  title={AccelHydra: A Lightweight, Configurable and Modular Training Framework based on Accelerate and Hydra.},
  author={Xuenan Xu and Yixuan Li and Jiahao Mei},
  howpublished={\url{https://github.com/wsntxxn/AccelHydra}},
  year={2025}
}
