
MindSpore with Ascend on ModelArts Launcher


A simple and clean launcher that helps you train deep models using MindSpore with Ascend on ModelArts (ROMA or HuaweiCloud), no bells and whistles.

[!NOTE] This project is a shameless wrapper of scripts from HuaweiCloud; all credit goes to them.

Installation

pip install ma2l

Or, you can get a pre-release version from test.pypi.org:

pip install -i https://test.pypi.org/simple/ ma2l

Usage

Just submit the following command as your training job on the cloud:

ma2l YOUR_TRAINING_COMMAND

For example, YOUR_TRAINING_COMMAND might look like this:

python your_train_script.py \
    --arg1=value1 \
    --arg2=value2 \
    ...

See the difference between the command on your local machine and the one on the cloud:

- python your_train_script.py \
+ ma2l python your_train_script.py \
    --arg1=value1 \
    --arg2=value2 \
    ...

[!IMPORTANT] Don't forget to pass the argument that turns on distributed training to your training script, if it requires one.
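For instance, if your script exposes a flag such as --distribute (a hypothetical name; neither this flag nor the MindSpore calls below are provided by ma2l), the handling might look like the following minimal sketch, assuming MindSpore 1.x-style APIs:

# Hypothetical sketch of a training script that enables data-parallel training
# when --distribute is passed. The flag name and MindSpore calls are assumptions,
# not part of ma2l itself.
import argparse

from mindspore import context
from mindspore.context import ParallelMode
from mindspore.communication import init, get_group_size

parser = argparse.ArgumentParser()
parser.add_argument("--distribute", action="store_true",
                    help="enable distributed (data-parallel) training on Ascend")
args = parser.parse_args()

context.set_context(mode=context.GRAPH_MODE, device_target="Ascend")

if args.distribute:
    # init() reads the HCCL configuration (e.g. RANK_TABLE_FILE, RANK_ID)
    # that the launcher exports for this process.
    init()
    context.set_auto_parallel_context(parallel_mode=ParallelMode.DATA_PARALLEL,
                                      gradients_mean=True,
                                      device_num=get_group_size())

On the cloud, the command would then simply be ma2l python your_train_script.py --distribute.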

Features

  • No need to change a single line of your training script.
  • No need to set any distribution-related environment variables; the launcher takes care of everything for you.
  • Supports a variety of hardware settings:
    • single-node, single-NPU
    • single-node, multi-NPU
    • multi-node, multi-NPU
  • Modular: fully decoupled from your training code/repository.

Philosophies

So, what happens under the hood? After you have created a training job on ModelArts, the launcher does the following:

  1. Generates the HCCL configuration file (typically named rank_table.json) on each node, which is required for distributed training.
  2. Automatically starts n processes of YOUR_TRAINING_COMMAND on each node, based on the settings you chose when creating the job.
  3. Sets the environment variables for each process on each node, such as RANK_TABLE_FILE, DEVICE_ID, and RANK_ID.

Simple and easy, right? You don't need to change any code to move your training scripts from the local machine to the cloud, and you don't need to struggle with the environment-variable setup yourself.
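For reference, a training script can also read those variables directly. The sketch below is only an illustration; the variable names come from the list above, while the MindSpore calls are MindSpore 1.x-style assumptions and are not part of ma2l:

# Minimal sketch: consuming the per-process environment variables set by the launcher.
import os

from mindspore import context

device_id = int(os.environ.get("DEVICE_ID", "0"))
rank_id = int(os.environ.get("RANK_ID", "0"))
rank_table = os.environ.get("RANK_TABLE_FILE")  # path to the generated rank_table.json

context.set_context(mode=context.GRAPH_MODE, device_target="Ascend", device_id=device_id)
print(f"rank {rank_id} runs on NPU {device_id}, rank table: {rank_table}")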

FAQs

What does ma2l mean?

ma2l is the abbreviation for MindSpore with Ascend on ModelArts Launcher.

Credits
