A simple wrapper process around cloud service providers to run tools for the RAPIDS Accelerator for Apache Spark.

spark-rapids-user-tools

User tools to help with the adoption, installation, execution, and tuning of RAPIDS Accelerator for Apache Spark.

The wrapper improves the end-user experience in the following areas:

  1. Qualification: Educate CPU customers on the cost savings and acceleration potential of RAPIDS Accelerator for Apache Spark. The output lists the apps recommended for RAPIDS Accelerator for Apache Spark with their estimated savings and speed-up.
  2. Bootstrap: Provide optimized RAPIDS Accelerator for Apache Spark configs based on the GPU cluster shape. The output shows the updated Spark config settings on the driver node.
  3. Tuning: Tune RAPIDS Accelerator for Apache Spark configs based on an initial job run, using Spark event logs. The output shows the recommended per-app RAPIDS Accelerator for Apache Spark config settings.
  4. Diagnostics: Run diagnostic functions to validate a Dataproc environment with RAPIDS Accelerator for Apache Spark, making sure the cluster is healthy and ready for Spark jobs.

Getting started

Set up a Python environment with a Python version between 3.8 and 3.10.

  1. Run the project in a virtual environment.

    $ python -m venv .venv
    $ source .venv/bin/activate
    
  2. Install spark-rapids-user-tools using one of the following options.

    • Install the released package.

      $ pip install spark-rapids-user-tools
      
    • Install from source.

      $ pip install -e .
      
    • Install a wheel package built from the repo (see the build steps below).

      $ pip install <wheel-file>
      
  3. Make sure to install the SDK of your cloud service provider (CSP) if you plan to run the tool wrapper.
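
The prerequisites above can be checked before running the tools. The following is a minimal sketch and not part of the package: the helper names `python_version_supported` and `installed_version` are hypothetical, the version range is assumed to be inclusive at both ends, and the only name taken from this README is the distribution name `spark-rapids-user-tools` used in the pip command.

```python
import sys
from importlib import metadata

# Range stated in this README: Python 3.8 through 3.10.
# Assumption: both ends are inclusive.
MIN_VERSION = (3, 8)
MAX_VERSION = (3, 10)


def python_version_supported(version=None):
    """Return True if (major, minor) falls within the supported range."""
    if version is None:
        version = sys.version_info[:2]
    return MIN_VERSION <= tuple(version[:2]) <= MAX_VERSION


def installed_version(dist_name="spark-rapids-user-tools"):
    """Return the installed version string, or None if not installed."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    print("Python version supported:", python_version_supported())
    print("spark-rapids-user-tools version:", installed_version())
```

Running the script inside the activated virtual environment shows whether the interpreter is in range and whether the package is visible to it.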

Building from source

Set up a Python environment as described in the steps above.

  1. Run the provided build script to compile the project.

    $ ./build.sh
    

Usage and supported platforms

Please refer to the spark-rapids-user-tools guide for details on how to use the tools and on the supported platforms.

What's new

Please refer to CHANGELOG.md for the latest changes.

File details

Details for the file spark_rapids_user_tools-23.8.0-173_a051128-py3-none-any.whl.

File hashes

Hashes for spark_rapids_user_tools-23.8.0-173_a051128-py3-none-any.whl
Algorithm Hash digest
SHA256 eeb5a7bf4cc72525fbf2f3bf02b6d619f91a301c866aa5bcb925d8c14fb1db3d
MD5 133782d224d53da7cf47c674f7184daf
BLAKE2b-256 ce0a4d7e4d1e89f480872a300c1a3b3ba7ccfdc8e9cfde5349a8ff4e5659a8b9
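
The digests listed above can be compared against a downloaded wheel before installing it. The following is a minimal sketch using Python's standard `hashlib` module; the function names `sha256_of` and `verify_wheel` are hypothetical, and the expected value is the SHA256 digest listed above.

```python
import hashlib

# SHA256 digest listed above for the 23.8.0 wheel file.
EXPECTED_SHA256 = (
    "eeb5a7bf4cc72525fbf2f3bf02b6d619f91a301c866aa5bcb925d8c14fb1db3d"
)


def sha256_of(path, chunk_size=1 << 20):
    """Compute the SHA256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_wheel(path, expected=EXPECTED_SHA256):
    """Return True if the file's SHA256 digest matches the expected one."""
    return sha256_of(path) == expected.lower()
```

Calling `verify_wheel` with the local path of the downloaded wheel returns `False` if the file was corrupted or altered in transit.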