CLI for AcceleratorModule library (accmt).

AcceleratorModule CLI

This tool is a wrapper around Accelerate's command-line utility ('accelerate').

Installation

accmt-cli is installed automatically with the accmt library. You can also install it separately via pip:

pip install accmt-cli

Launch

You can launch any distributed training process with the following command:

accmt launch [-N][-n, --gpus][-O1][--strat] <your_python_script> [...]

Where:

  • -N (optional): The number of processes, or a Python-like slice that selects GPUs starting from a given index (e.g. '-N=2:' takes the GPUs from index 2 onward).
  • -n or --gpus (optional): A comma-separated list of CUDA device indices (e.g. '-n=1,3,5,6' takes the GPUs with indices 1, 3, 5 and 6).
  • -O1 (optional): Applies optimization type 1, which calculates an efficient value for 'OMP_NUM_THREADS' based on how many processes will run your training script.
  • --strat (optional): The strategy to use, or the path to an Accelerate configuration file (e.g. one generated with 'accelerate config --config_file=your-config.yaml'). See 'accmt strats' for the available strategies.
  • ... (optional): Any additional arguments that your Python script accepts.
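
The device-selection and thread-count options above can be illustrated with a small Python sketch. This is not accmt's actual implementation: the function names are hypothetical, the rule that a plain integer for -N takes the first N devices is an assumption, and the 'OMP_NUM_THREADS' heuristic (CPU cores divided by process count) is a common convention rather than something the tool documents.

```python
import os
from typing import List, Optional


def resolve_devices(n_value: str, device_count: int) -> List[int]:
    """Interpret an -N value as a process count or a Python-like slice.

    Hypothetical helper; the behaviour for a plain integer is an assumption.
    """
    devices = list(range(device_count))
    if ":" in n_value:
        parts = n_value.split(":")
        if len(parts) > 3:
            raise ValueError(f"invalid slice: {n_value!r}")
        # Empty fields behave like omitted slice bounds, as in Python.
        bounds = [int(p) if p.strip() else None for p in parts]
        return devices[slice(*bounds)]
    # Assumption: a plain integer N means "use the first N devices".
    return devices[: int(n_value)]


def parse_gpu_list(value: str) -> List[int]:
    """Interpret a -n/--gpus value such as '1,3,5,6' as device indices."""
    return [int(item) for item in value.split(",") if item.strip()]


def omp_threads(num_processes: int, cpu_count: Optional[int] = None) -> int:
    """Assumed heuristic: split the CPU cores evenly across processes."""
    if num_processes < 1:
        raise ValueError("num_processes must be at least 1")
    cores = cpu_count if cpu_count is not None else (os.cpu_count() or 1)
    return max(1, cores // num_processes)


if __name__ == "__main__":
    print(resolve_devices("2:", 8))       # GPUs from index 2 onward
    print(parse_gpu_list("1,3,5,6"))      # explicit device indices
    print(omp_threads(4, cpu_count=16))   # threads per process
```

Under these assumptions, '-N=2:' on an 8-GPU machine would select indices 2 through 7, which matches the example given for -N.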

Get model from checkpoint

You can get a model from any checkpoint using the following command:

accmt get <checkpoint> --out=<output-model-directory> [--dtype]

Where:

  • --out or -O (REQUIRED): Name of the output directory where the model will be saved.
  • --dtype (optional): PyTorch data type of the model parameters. Default is 'float32'.
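
If you want to call this command from a script, a minimal sketch could assemble the argument list in Python. The helper name build_get_command, the example paths, and the '--dtype=<value>' spelling are assumptions for illustration; only the 'accmt get <checkpoint> --out=<dir> [--dtype]' shape comes from the usage line above.

```python
import shlex
from typing import List, Optional


def build_get_command(checkpoint: str, out: str,
                      dtype: Optional[str] = None) -> List[str]:
    """Build the argument list for 'accmt get' (hypothetical helper)."""
    cmd = ["accmt", "get", checkpoint, f"--out={out}"]
    if dtype is not None:
        # Assumption: --dtype takes its value in the same '=' form as --out.
        cmd.append(f"--dtype={dtype}")
    return cmd


if __name__ == "__main__":
    # Example paths are placeholders, not real checkpoints.
    command = build_get_command("checkpoints/run1", "exported_model")
    print(shlex.join(command))
```

The resulting list can be passed to subprocess.run(command, check=True) once accmt-cli is installed. Omitting dtype leaves the documented default of 'float32' in effect.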

Strats

You can list the included strategies with the following command:

accmt strats [--ddp][--fsdp][--deepspeed]

Where:

  • --ddp: Show only DDP strategies.
  • --fsdp: Show only FSDP strategies.
  • --deepspeed: Show only DeepSpeed strategies.

Example

Generate an example HPS config file with the following command:

accmt example

This will generate a file called 'hps_example.yaml' in your current directory.

Debug

Enable debug mode with:

accmt debug [--level] ...

Where the --level flag is an integer that indicates the level of debugging. Available levels are:

LEVEL 1:

  • Disables logging (MLFlow, Tensorboard, etc).

LEVEL 2:

  • Disables model and teacher compilation.

LEVEL 3:

  • Disables model saving, checkpointing and resuming (no folders will be created).

LEVEL 4 (default):

  • Forces eval_when_start (in Trainer) to False.

LEVEL 5:

  • Disables any evaluation.
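
The levels above can be summarized as a lookup table. The sketch below is illustrative only: the names DEBUG_LEVELS and debug_effects are hypothetical, and the idea that each level also applies every lower level is an assumption (the text above does not say whether levels are cumulative), so it is exposed as a switch.

```python
from typing import List

# Effects per level, as listed in the Debug section.
DEBUG_LEVELS = {
    1: "logging disabled (MLFlow, Tensorboard, etc.)",
    2: "model and teacher compilation disabled",
    3: "model saving, checkpointing and resuming disabled",
    4: "eval_when_start forced to False",
    5: "all evaluation disabled",
}


def debug_effects(level: int = 4, cumulative: bool = True) -> List[str]:
    """Return the effects of a debug level (4 is the documented default).

    Assumption: with cumulative=True, a level includes all lower levels.
    """
    if level not in DEBUG_LEVELS:
        raise ValueError(f"unknown debug level: {level}")
    if cumulative:
        return [DEBUG_LEVELS[i] for i in range(1, level + 1)]
    return [DEBUG_LEVELS[level]]


if __name__ == "__main__":
    for effect in debug_effects():
        print(effect)
```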
