A benchmarking tool for AutoML

AutoGluon-Bench

Welcome to AutoGluon-Bench, a suite for benchmarking your AutoML frameworks.

Setup

Follow the steps below to set up autogluon-bench:

# create virtual env and update pip
python3 -m venv .venv_agbench
source .venv_agbench/bin/activate
python3 -m pip install --upgrade pip

Install autogluon-bench from PyPI:

python3 -m pip install autogluon.bench

Install autogluon-bench from source for development:

git clone https://github.com/autogluon/autogluon-bench.git
cd autogluon-bench

# install from source in editable mode
pip install -e ".[tests]"

During development, be aware that autogluon.bench is installed as a dependency in certain places, such as the Dockerfile and Multimodal Setup. To have benchmark runs pick up your development changes, push them to a remote GitHub branch and provide its URI when running:

agbench run sample_configs/multimodal_cloud_configs.yaml --dev-branch https://github.com/<username>/autogluon-bench.git#<dev_branch>

Run benchmarks locally

To run the benchmarks on your local machine, use the following command:

agbench run path/to/local_config_file

Check out our sample local configuration files for local runs.

The results are stored in the following directory: {WORKING_DIR}/{root_dir}/{module}/{benchmark_name}_{timestamp}.
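
As a rough illustration of the directory template above, the following sketch composes the results path from its parts. The function name, the example values, and the timestamp format are assumptions made for this example, not taken from autogluon-bench.

```python
from pathlib import Path


def benchmark_result_dir(working_dir, root_dir, module, benchmark_name, timestamp):
    """Compose {WORKING_DIR}/{root_dir}/{module}/{benchmark_name}_{timestamp}."""
    return Path(working_dir) / root_dir / module / f"{benchmark_name}_{timestamp}"


# Hypothetical values; the timestamp format is an assumption.
result_dir = benchmark_result_dir(
    "/home/user/work", "ag_bench_runs", "tabular", "ag_bench", "20230102T030405"
)
print(result_dir.as_posix())
```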

Tabular Benchmark

To perform tabular benchmarking, set the module to tabular. You must set both Benchmark Configurations and Tabular Specific configurations, and each should have a single value. Refer to the sample configuration file for more details.

The tabular module leverages the AMLB benchmarking framework. Required and optional AMLB arguments are specified via the configuration file mentioned previously.

Custom configuration is supported by providing a local directory as amlb_user_dir in the config, through which custom frameworks, constraints and datasets can be overridden. We provide a minimal working custom config setup for benchmarking a custom framework (an AutoGluon dev branch). In the sample configuration file, change the following fields to:

framework: AutoGluon_dev:example
amlb_user_dir: path_to/sample_configs/amlb_configs 

For more customizations, please follow the example custom configuration folder provided by AMLB and their documentation.

Multimodal Benchmark

For multimodal benchmarking, set the module to multimodal. We currently support benchmarking multimodal on a custom branch. Note that multimodal benchmarking directly calls the MultiModalPredictor, bypassing the extra layer of AMLB. Therefore, the required arguments are different from those for tabular.

You can add more datasets to your benchmarking jobs. We support custom datasets with custom defined data loaders. Follow these steps:

  1. Create a folder under the working directory, e.g. custom_dataloader/
  2. Create a dataset YAML file, custom_dataloader/datasets.yaml, which includes all required properties for your problem type. Please refer to the function.
  3. Create a dataset loader class, custom_dataloader/dataloader.py, which downloads and loads the dataset as a dataframe. Please set the required properties as mentioned above.
  4. Add custom_dataloader in the agbench run configuration, where dataloader_file, class_name and dataset_config_file are required.
  5. Make sure you have the proper permission to download the dataset. If running in AWS mode, we support downloading from the S3 bucket specified as DATA_BUCKET in the agbench run configuration under the same AWS Batch deployment account.

Please refer to sample_configs/dataloaders for more examples.
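
The sketch below illustrates step 4 under stated assumptions: it checks that a custom_dataloader entry carries the three required keys and then imports the named class from the dataloader file. The helper names are hypothetical, and loading through importlib is only one plausible approach; how agbench itself resolves the class is not described here.

```python
import importlib.util
from pathlib import Path

# The three keys named as required in step 4.
REQUIRED_KEYS = ("dataloader_file", "class_name", "dataset_config_file")


def missing_dataloader_keys(custom_dataloader):
    """Return the required keys absent from a custom_dataloader mapping."""
    return [key for key in REQUIRED_KEYS if key not in custom_dataloader]


def load_dataloader_class(dataloader_file, class_name):
    """Import a class by name from a Python file on disk (assumed mechanism)."""
    path = Path(dataloader_file)
    spec = importlib.util.spec_from_file_location(path.stem, path)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return getattr(module, class_name)


# Hypothetical entry that forgot dataset_config_file.
entry = {"dataloader_file": "custom_dataloader/dataloader.py", "class_name": "MyLoader"}
print(missing_dataloader_keys(entry))
```

Checking the keys before submitting a run surfaces a misconfigured entry locally rather than inside a benchmark job.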

Run benchmarks on AWS

AutoGluon-Bench uses the AWS CDK to build an AWS Batch compute environment for benchmarking.

To get started, install Node.js and AWS CDK with the following instructions:

  1. Install Node Version Manager.
  2. Source profile or restart the terminal.
  3. Follow the Prerequisites section on the AWS CDK Guide and install an appropriate Node.js version for your system:
nvm install $VERSION  # install Node.js
npm install -g aws-cdk  # install aws-cdk
cdk --version  # verify the installation, you might need to update the Node.js version depending on the log.
  4. Follow the AWS CLI Installation Guide to install awscliv2.

If this is your first time using CDK to deploy to an AWS environment (an AWS environment is a combination of an AWS account and Region), run the following:

cdk bootstrap aws://CDK_DEPLOY_ACCOUNT/CDK_DEPLOY_REGION

You will need a cloud configuration file to run the benchmarks. You can edit the provided sample cloud config files, or use the CLI tool to generate the cloud config files locally.

For multimodal:

agbench generate-cloud-config --module multimodal --cdk-deploy-account <AWS_ACCOUNT_ID> --cdk-deploy-region <AWS_ACCOUNT_REGION> --prefix <PREFIX> --metrics-bucket <METRICS_BUCKET> --git-uri-branch <GIT_URI#BRANCH> --dataset-names DATASET_1,DATASET_2 --presets <PRESET_1>,<PRESET_2> --time-limit <TIME_LIMIT_1>,<TIME_LIMIT_2> --hyperparameters "key_1:value_1,key_2:value_2;key_1:value_3,key_2:value_4"

For tabular:

agbench generate-cloud-config --module tabular --cdk-deploy-account <AWS_ACCOUNT_ID> --cdk-deploy-region <AWS_ACCOUNT_REGION> --prefix <PREFIX> --metrics-bucket <METRICS_BUCKET> --framework <FRAMEWORK>:<LABEL> --amlb-benchmark <BENCHMARK1>,<BENCHMARK2> --amlb-task "BENCHMARK1:DATASET1,DATASET2;BENCHMARK2:DATASET3" --amlb-constraint <CONSTRAINT>

For more details, run:

agbench generate-cloud-config --help
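
The multi-value arguments in the commands above pack several settings into one string. The following sketch shows one way to read that format, with `;` separating groups and `,` separating items within a group. It is an interpretation of the examples shown, not the parser agbench uses, and both function names are hypothetical.

```python
def parse_hyperparameters(text):
    """Split 'k1:v1,k2:v2;k1:v3,k2:v4' into one dict per ';'-separated group."""
    groups = []
    for group in text.split(";"):
        pairs = (item.split(":", 1) for item in group.split(","))
        groups.append({key.strip(): value.strip() for key, value in pairs})
    return groups


def parse_amlb_task(text):
    """Split 'B1:D1,D2;B2:D3' into a benchmark -> dataset-list mapping."""
    tasks = {}
    for entry in text.split(";"):
        benchmark, datasets = entry.split(":", 1)
        tasks[benchmark.strip()] = [name.strip() for name in datasets.split(",")]
    return tasks


print(parse_hyperparameters("key_1:value_1,key_2:value_2;key_1:value_3,key_2:value_4"))
print(parse_amlb_task("BENCHMARK1:DATASET1,DATASET2;BENCHMARK2:DATASET3"))
```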

Once the configuration file is ready, use the command below to start benchmark runs on the cloud:

agbench run /path/to/cloud_config_file

This command automatically sets up an AWS Batch environment using the instance specifications defined in the cloud_config_file. It also creates a Lambda function named with your chosen LAMBDA_FUNCTION_NAME. This Lambda function is automatically invoked with the cloud config file you provided and submits multiple AWS Batch jobs to the job queue (named with the PREFIX you provided).

In order for the Lambda function to submit multiple jobs simultaneously, you need to specify a list of values for each module-specific key. Each combination of configurations is saved and uploaded to your specified METRICS_BUCKET in S3, stored under S3://{METRICS_BUCKET}/configs/{BENCHMARK_NAME}_{timestamp}/{BENCHMARK_NAME}_split_{UID}.yaml. Here, UID is a unique ID assigned to the split.
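
To make the idea of configuration splits concrete, here is a minimal sketch that expands lists of values into one single-valued config per combination and fills in the S3 path template quoted above. Treating the expansion as a Cartesian product, the key names, and the UID scheme are all assumptions for illustration.

```python
import itertools
import uuid


def split_config(module_config):
    """Yield one single-valued config per combination of the listed values."""
    keys = list(module_config)
    for values in itertools.product(*(module_config[key] for key in keys)):
        yield dict(zip(keys, values))


def split_s3_key(metrics_bucket, benchmark_name, timestamp, uid):
    """Fill in the split path template quoted above."""
    run = f"{benchmark_name}_{timestamp}"
    return f"s3://{metrics_bucket}/configs/{run}/{benchmark_name}_split_{uid}.yaml"


# Hypothetical module-specific keys, each with a list of values.
config = {"dataset_name": ["DATASET_1", "DATASET_2"], "presets": ["PRESET_1", "PRESET_2"]}
splits = list(split_config(config))
for split in splits:
    uid = uuid.uuid4().hex  # assumed UID scheme, for illustration only
    print(split_s3_key("my-metrics-bucket", "ag_bench", "20230102T030405", uid), split)
```

With two datasets and two presets, this yields four splits, which under this reading would correspond to four submitted jobs.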

The AWS infrastructure configurations and submitted job IDs are saved locally at {WORKING_DIR}/{root_dir}/{module}/{benchmark_name}_{timestamp}/aws_configs.yaml. You can use this file to check the job status at any time:

agbench get-job-status --config-file /path/to/aws_configs.yaml

You can also check the job status using job IDs:

agbench get-job-status --job-ids JOB_ID_1 --job-ids JOB_ID_2 --cdk-deploy-region AWS_REGION

Job logs can be viewed on the AWS console. Each job has a UID attached to its name, which you can use to identify the respective config split. After the jobs complete and reach the SUCCEEDED status in the job queue, you'll find metrics saved under S3://{METRICS_BUCKET}/{module}/{benchmark_name}_{timestamp}/{benchmark_name}_{timestamp}_{UID}.

A cloud configuration file with a time-stamped benchmark_name is also saved under {WORKING_DIR}/{root_dir}/{module}/{benchmark_name}_{timestamp}/{module}_cloud_configs.yaml.

By default, the infrastructure created is retained for future use. To automatically remove resources after the run, use the --remove-resources option:

agbench run path/to/cloud_config_file --remove-resources

This will check the job status every 2 minutes and remove resources after all jobs succeed. If any job fails, resources will be kept.
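
The polling behaviour described above can be sketched as follows. This is an illustration only: the function name and the injected callables are hypothetical, and waiting until every job reaches a terminal state before deciding is an assumption about the control flow. SUCCEEDED and FAILED are the terminal AWS Batch job states.

```python
import time

TERMINAL_STATES = {"SUCCEEDED", "FAILED"}


def wait_and_cleanup(get_statuses, remove_resources, interval_seconds=120, sleep=time.sleep):
    """Poll until every job is terminal; remove resources only if all succeeded."""
    while True:
        statuses = get_statuses()  # mapping of job ID to status string
        if all(status in TERMINAL_STATES for status in statuses.values()):
            break
        sleep(interval_seconds)  # 2 minutes between checks by default
    if all(status == "SUCCEEDED" for status in statuses.values()):
        remove_resources()
        return True
    return False  # at least one job failed, so resources are kept


# Demo with canned statuses and a no-op sleep so it finishes instantly.
canned = iter([{"job-1": "RUNNING"}, {"job-1": "SUCCEEDED"}])
cleaned = wait_and_cleanup(
    lambda: next(canned), lambda: print("removing resources"), sleep=lambda _: None
)
print(cleaned)
```

Passing the status lookup and the cleanup step as callables keeps the loop testable without touching AWS.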

If you want to manually remove resources later, use:

agbench destroy-stack --config-file {WORKING_DIR}/{root_dir}/{module}/{benchmark_name}_{timestamp}/aws_configs.yaml

Or you can remove specific stacks by running:

agbench destroy-stack --static-resource-stack STATIC_RESOURCE_STACK_NAME --batch-stack BATCH_STACK_NAME --cdk-deploy-account CDK_DEPLOY_ACCOUNT --cdk-deploy-region CDK_DEPLOY_REGION

where you can find all argument values in {WORKING_DIR}/{root_dir}/{module}/{benchmark_name}_{timestamp}/aws_configs.yaml.

Configure the AWS infrastructure

The default infrastructure configurations are located here.

where:

  • CDK_DEPLOY_ACCOUNT and CDK_DEPLOY_REGION should be overridden with your AWS account ID and desired region to create the stack.
  • PREFIX is used as an identifier for the stack and resources created.
  • RESERVED_MEMORY_SIZE is used together with the instance memory size to calculate the container shm_size.
  • BLOCK_DEVICE_VOLUME is the size of storage device attached to instance.
  • LAMBDA_FUNCTION_NAME is the name of the Lambda function that submits jobs to AWS Batch.

To override these configurations, use the cdk_context key in your custom config file. See our sample cloud config for reference.

Monitoring metrics for your instances on AWS

A variety of metrics are available for the EC2 instances that are launched during benchmarking. These can be accessed through the AWS Console by following this navigation path: CloudWatch -> All metrics -> AWS namespaces -> EC2. For a comprehensive list of these metrics, refer to the official AWS documentation.

In addition to the standard metrics, we also provide a custom metric for GPUUtilization. This can be found in the CloudWatch section under All metrics -> Custom namespaces -> EC2. Please note that the GPUUtilization metric is updated every five minutes.

Evaluating benchmark runs

Tabular benchmark results can be evaluated using the tools in src/autogluon/bench/eval/. The evaluation logic will aggregate, clean, and produce evaluation results for runs stored in S3. In a future release, we intend to add evaluation support for multimodal benchmark results.

Evaluation Steps

Begin by setting up AWS credentials for the default profile for the AWS account that has the benchmark results in S3.

Step 1: Aggregate AMLB results on S3. After running the benchmark in AWS mode, take note of the benchmark_name with timestamp in {WORKING_DIR}/{root_dir}/{module}/{benchmark_name}_{timestamp}/{module}_cloud_configs.yaml and run the command below:

agbench aggregate-amlb-results {METRICS_BUCKET} {module} {benchmark_name} --constraint {constraint}

This will create a new file on S3 with this signature:

s3://{METRICS_BUCKET}/aggregated/{module}/{benchmark_name}/results_automlbenchmark_{constraint}_{benchmark_name}.csv

Currently, aggregation is also supported for multimodal benchmark results, without the --constraint option.

For more details, run:

agbench aggregate-amlb-results --help

Step 2: Further clean the aggregated results.

If the file is still on S3 from the previous step, run:

agbench clean-amlb-results {benchmark_name} --results-dir-input s3://{METRICS_BUCKET}/aggregated/{module}/{benchmark_name}/ --benchmark-name-in-input-path --constraints constraint_1 --constraints constraint_2 --results-dir-output {results_dir_output} --out-path-prefix {out_path_prefix} --out-path-suffix {out_path_suffix}

where {results_dir_input} can also be a local directory. This will create a local file {results_dir_output}/{out_path_prefix}{benchmark_name}{out_path_suffix}.

For more details, run:

agbench clean-amlb-results --help
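
To show how the file locations from Steps 1 and 2 relate, the sketch below fills in the path templates given above. The function names and example values are hypothetical; only the templates themselves come from this document.

```python
import posixpath


def aggregated_results_dir(metrics_bucket, module, benchmark_name):
    """Directory written by Step 1, usable as --results-dir-input in Step 2."""
    return f"s3://{metrics_bucket}/aggregated/{module}/{benchmark_name}/"


def aggregated_results_file(metrics_bucket, module, benchmark_name, constraint):
    """Full URI of the aggregated CSV created by Step 1."""
    name = f"results_automlbenchmark_{constraint}_{benchmark_name}.csv"
    return aggregated_results_dir(metrics_bucket, module, benchmark_name) + name


def cleaned_results_file(results_dir_output, benchmark_name, out_path_prefix="", out_path_suffix=""):
    """Local file created by Step 2."""
    file_name = f"{out_path_prefix}{benchmark_name}{out_path_suffix}"
    return posixpath.join(results_dir_output, file_name)


# Hypothetical values for illustration.
print(aggregated_results_file("my-metrics-bucket", "tabular", "ag_bench_20230102T030405", "1h8c"))
print(cleaned_results_file("data/results", "ag_bench_20230102T030405", "cleaned_", ".csv"))
```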

Step 3: Run evaluation on multiple cleaned files from Step 2

agbench evaluate-amlb-results --frameworks_run framework_1 --frameworks_run framework_2 --results-dir-input data/results/input/prepared/openml --paths file_name_1.csv --paths file_name_2.csv --no-clean-data
