Turn an AML compute cluster into a Ray cluster
Ray on Azure ML
This package simplifies setup of Ray and Ray's components such as DaskOnRay, SparkOnRay, Ray Machine Learning in Azure ML for your data science projects. This allows you to use AML compute as Ray/Dask cluster in both interactive and job modes.
Architecture
Prerequisites
Before running the samples, check the following.
1. Configure Azure Environment
For interactive use from your compute instance, create a compute cluster in the same virtual network as the compute instance, then run the code below to get a handle to the Ray cluster.
Check list
[ ] Azure Machine Learning Workspace
[ ] Virtual network/Subnet
[ ] Create Compute Instance in the Virtual Network
[ ] Create Compute Cluster in the same Virtual Network
2. Select kernel
Use the `azureml_py38` conda environment from a (Jupyter) notebook in Azure Machine Learning Studio.
Note: Due to a conda environment issue, VS Code is not yet supported when the compute instance is used as the head node, i.e., when `ci_is_head=True` is passed to the `getRay()` method.
3. Install library
```shell
pip install --upgrade ray-on-aml
```
Installing this library also installs ray[default]==1.9.1, pyarrow>=5.0.0, dask[complete]==2021.12.0, adlfs==2021.10.0, and fsspec==2021.10.1.
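To confirm locally that these pinned dependencies resolved after installation, a small stdlib-only check can help. This is a sketch: `installed_versions` is a hypothetical helper, and `importlib.metadata` requires Python 3.8+.

```python
from importlib import metadata

def installed_versions(packages):
    """Return {distribution: version or None} for each requested package."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None  # not installed in this environment
    return versions

# Distribution names only; extras such as [default] are not part of the name.
print(installed_versions(["ray", "pyarrow", "dask", "adlfs", "fsspec"]))
```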
4. Run ray-on-aml
Run in interactive mode in a Compute Instance notebook
```python
from azureml.core import Workspace
from ray_on_aml.core import Ray_On_AML

ws = Workspace.from_config()
ray_on_aml = Ray_On_AML(ws=ws, compute_cluster="Name_of_Compute_Cluster")
ray = ray_on_aml.getRay()  # may take around 7 minutes or more
```
Note that by default, one node in the remote AML compute cluster is used as the head node and the remaining nodes are workers. If you instead want to use your current compute instance as the head node, with all nodes of the remote compute cluster as workers, pass `ci_is_head=True` to `getRay()`. To install additional libraries, use the `additional_pip_packages` and `additional_conda_packages` parameters.
```python
ray_on_aml = Ray_On_AML(ws=ws, compute_cluster="d15-v2",
                        additional_pip_packages=['torch==1.10.0', 'torchvision', 'sklearn'],
                        maxnode=4)
ray = ray_on_aml.getRay(ci_is_head=True)
```
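Once `getRay()` returns, a quick smoke test confirms the workers have joined. The sketch below uses standard Ray APIs (`ray.nodes()`, `ray.cluster_resources()`); `cluster_summary` is a hypothetical helper, not part of this package.

```python
def cluster_summary(ray):
    """Summarize the cluster from the handle returned by getRay()."""
    resources = ray.cluster_resources()
    return {
        "nodes": len(ray.nodes()),            # head + workers that have joined
        "total_cpus": resources.get("CPU", 0.0),
    }

# Example: print(cluster_summary(ray)) after getRay() completes.
```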
Advanced usage: `Ray_On_AML()` accepts two arguments, `base_conda_dep` and `base_pip_dep`, that specify the cluster's base configuration; their default values are shown below. Although it is possible, you should not change these defaults unless you need to customize the cluster configuration (for example, the Ray version), as incompatible values may break the package.
```python
Ray_On_AML(ws=ws, compute_cluster="Name_of_Compute_Cluster",
           base_conda_dep=['adlfs==2021.10.0', 'pip'],
           base_pip_dep=['ray[tune]==1.9.1', 'xgboost_ray==0.1.5', 'dask==2021.12.0',
                         'pyarrow >= 5.0.0', 'fsspec==2021.10.1'])
```
For use in an Azure ML job, include `ray_on_aml` as a pip dependency, and in your script get the Ray handle as follows:
```python
from ray_on_aml.core import Ray_On_AML

ray_on_aml = Ray_On_AML()
ray = ray_on_aml.getRay()

if ray:  # in the head node
    # Logic to use Ray for distributed ML training, tuning,
    # or distributed data transformation with Dask
    pass
else:
    print("in worker node")
```
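As a concrete illustration of the head-node branch above, the head might fan a toy task out across the cluster. This is a sketch using standard Ray APIs (`ray.remote`, `ray.get`); `head_node_work` and `square` are hypothetical names.

```python
def head_node_work(ray):
    """Runs only on the head node (where getRay() returns a truthy handle):
    fan a toy task out across the cluster and gather the results."""
    @ray.remote
    def square(x):
        return x * x

    # Launch four remote tasks, then block until all results are back.
    return ray.get([square.remote(i) for i in range(4)])
```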
5. Shut down the Ray cluster
To shut down the cluster, run the following.
```python
ray_on_aml.shutdown()
```
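Because the cluster keeps consuming compute until it is torn down, wrapping the work in try/finally guarantees shutdown even when training fails. This is a sketch; `with_ray_cluster` is a hypothetical helper built on the `Ray_On_AML` API shown above.

```python
def with_ray_cluster(ray_on_aml, work):
    """Run work(ray) against the cluster, always shutting it down afterwards.

    ray_on_aml: a Ray_On_AML instance; work: a function taking the ray handle.
    """
    try:
        ray = ray_on_aml.getRay()
        return work(ray)
    finally:
        ray_on_aml.shutdown()
```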
Check out the quick start examples to learn more.
Hashes for ray_on_aml-0.0.6-py3-none-any.whl
| Algorithm | Hash digest |
|---|---|
| SHA256 | f5fad08f30d95845801ac7c7f9efa37204779fd0f93010dac56039da7004c9f8 |
| MD5 | 0b1786e7e0ac77554c000119449cfef9 |
| BLAKE2b-256 | 10e3d7b7e6f751f90f3e7f76a2d39bf2ccc523e0859872683ec3e6b163079a8c |