[DEPRECATED - no longer maintained] An Apache Airflow provider package built by Astronomer to integrate with Ray.

Project description

⚠️ Discontinuation of project

This project is no longer actively maintained by Astronomer. Development has been paused and we are not accepting new contributions, bug fixes or releases. The code is still here for you to explore, fork and adapt under the terms of its license. Please note that it may not work with the latest dependencies or platforms, and it could contain security vulnerabilities. Astronomer can't offer guarantees or warranties for its use.

Google Cloud alternative (partial): If you run Ray on Google Cloud, the official Apache Airflow Google provider (apache-airflow-providers-google) ships Ray operators that cover a subset of what this provider does.

This provider is Kubernetes-generic, so the Google operators are a Google Cloud-specific alternative, not a drop-in replacement.

If you're interested in adopting or stewarding this project, we'd be happy to chat — reach us at oss@astronomer.io. Thanks for being part of the open-source journey and helping keep great ideas alive!


Ray provider

Docs | Getting Started | Slack (#airflow-ray) | Contribute

Orchestrate your Ray jobs using Apache Airflow®, combining Airflow's workflow management with Ray's distributed computing capabilities.

Benefits of using this provider include:

  • Integration: Incorporate Ray jobs into Airflow DAGs for unified workflow management.
  • Distributed computing: Use Ray's distributed capabilities within Airflow pipelines for scalable ETL, LLM fine-tuning, and similar workloads.
  • Monitoring: Track Ray job progress through Airflow's user interface.
  • Dependency management: Define and manage dependencies between Ray jobs and other tasks in DAGs.
  • Resource allocation: Run Ray jobs alongside other task types within a single pipeline.

Quickstart

Check out the Getting Started guide in our docs. Sample DAGs are available at example_dags/.

Sample DAGs

Example 1: Using @ray.task for the job lifecycle

The example below shows how to use the @ray.task decorator to manage the full lifecycle of a Ray cluster: setup, job execution, and teardown.

This approach is ideal for jobs that require a dedicated, short-lived cluster, and it optimizes resource usage by cleaning up after the task completes.

https://github.com/astronomer/astro-provider-ray/blob/bd6d847818be08fae78bc1e4c9bf3334adb1d2ee/example_dags/ray_taskflow_example.py#L1-L57
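
The linked DAG is not reproduced here. As a rough, library-free sketch of the lifecycle pattern described above (setup, job execution, teardown), the following uses a plain Python context manager. FakeCluster, ephemeral_cluster, and run_with_ephemeral_cluster are hypothetical names chosen for illustration; they are not part of this provider, Airflow, or Ray, and the job runs in-process rather than on a real cluster.

```python
from contextlib import contextmanager


class FakeCluster:
    """Hypothetical stand-in for a short-lived Ray cluster (not a real API)."""

    def __init__(self, name):
        self.name = name
        self.events = []  # records lifecycle steps in the order they happen

    def submit(self, fn, *args):
        # A real cluster would execute fn remotely; here it runs in-process.
        self.events.append("job")
        return fn(*args)


@contextmanager
def ephemeral_cluster(name):
    cluster = FakeCluster(name)
    cluster.events.append("setup")
    try:
        yield cluster
    finally:
        # Teardown runs even if the job raises, so nothing is left behind.
        cluster.events.append("teardown")


def run_with_ephemeral_cluster(fn, *args):
    with ephemeral_cluster("demo") as cluster:
        result = cluster.submit(fn, *args)
    return result, cluster.events


def mean_of_squares(values):
    return sum(v * v for v in values) / len(values)


result, events = run_with_ephemeral_cluster(mean_of_squares, [1, 2, 3])
print(events)
```

The point of the try/finally is that cleanup is tied to the job itself, which is the property that makes a dedicated, short-lived cluster economical.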

Example 2: Using SetupRayCluster, SubmitRayJob & DeleteRayCluster

This example shows how to use separate operators for cluster setup, job submission, and teardown, providing more granular control over the process.

This approach allows for more complex workflows involving Ray clusters.

Key Points:

  • Uses SetupRayCluster, SubmitRayJob, and DeleteRayCluster operators separately.
  • Allows for multiple jobs to be submitted to the same cluster before deletion.
  • Demonstrates how to pass cluster information between tasks using XCom.

This method is ideal for scenarios where you need fine-grained control over the cluster lifecycle, such as running multiple jobs on the same cluster or keeping the cluster alive for a certain period.

https://github.com/astronomer/astro-provider-ray/blob/bd6d847818be08fae78bc1e4c9bf3334adb1d2ee/example_dags/setup-teardown.py#L1-L44
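
The linked DAG is not reproduced here. To illustrate the idea of several jobs sharing one cluster and receiving its details through XCom, here is a minimal, library-free sketch. The dictionary stands in for Airflow's XCom store, the three functions stand in for the setup, submit, and delete steps, and the "dashboard" key and the address are placeholders. None of these names are the provider's actual operators or parameters.

```python
def setup_cluster(xcom):
    """Create the cluster and publish its address for downstream tasks."""
    xcom["dashboard"] = "http://ray-head.example:8265"  # placeholder address
    return "setup"


def submit_job(xcom, job_name):
    """Read the shared address and pretend to submit a job to it."""
    address = xcom["dashboard"]  # raises KeyError if setup has not run yet
    return f"{job_name} -> {address}"


def delete_cluster(xcom):
    """Tear the cluster down and withdraw its address."""
    xcom.pop("dashboard")
    return "teardown"


xcom = {}  # stands in for Airflow's XCom store
log = [setup_cluster(xcom)]
# Two jobs reuse the same cluster before it is deleted.
log += [submit_job(xcom, name) for name in ("train", "evaluate")]
log.append(delete_cluster(xcom))
print(log)
```

Because setup and teardown are separate steps, any number of submit steps can sit between them, at the cost of having to order the tasks explicitly.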

Getting Involved

Platform | Purpose | Est. Response time
Discussion Forum | General inquiries and discussions | < 3 days
GitHub Issues | Bug reports and feature requests | < 1-2 days
Slack | Quick questions and real-time chat | 12 hrs

Changelog

We follow Semantic Versioning for releases. Check CHANGELOG.rst for the latest changes.

Contributing Guide

All contributions, including bug reports, bug fixes, documentation improvements, and enhancements, are welcome.

A detailed overview of how to contribute can be found in the Contributing Guide.

License

Apache 2.0 License

Privacy Notice

This project follows Astronomer's Privacy Policy: https://www.astronomer.io/privacy/

Download files

Source Distribution

astro_provider_ray-0.4.0.tar.gz (22.8 kB view details)

Uploaded Source

Built Distribution

astro_provider_ray-0.4.0-py3-none-any.whl (24.3 kB view details)

Uploaded Python 3

File details

Details for the file astro_provider_ray-0.4.0.tar.gz.

File metadata

  • Download URL: astro_provider_ray-0.4.0.tar.gz
  • Upload date:
  • Size: 22.8 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.12.8

File hashes

Hashes for astro_provider_ray-0.4.0.tar.gz
Algorithm Hash digest
SHA256 a130803489cbb03daa03a8f428d3c9fc13066d12e062ad99a788baf9fff6e5f6
MD5 157236318393e2389b503e78f2145171
BLAKE2b-256 8ac9fc7215a7356e8297e5d52d551cd27d2e2911e323ed76942c74733b96d94e
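
One way to check a downloaded archive against the SHA256 digest listed above is with Python's standard hashlib module. This is a minimal sketch: the local file path is an assumption (the archive is expected in the current directory), and sha256_of and matches are helper names introduced here for illustration.

```python
import hashlib
from pathlib import Path

# SHA256 digest listed above for astro_provider_ray-0.4.0.tar.gz
EXPECTED_SHA256 = "a130803489cbb03daa03a8f428d3c9fc13066d12e062ad99a788baf9fff6e5f6"


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches(path, expected):
    """Compare a file's digest with the expected hex string, ignoring case."""
    return sha256_of(path) == expected.lower()


archive = Path("astro_provider_ray-0.4.0.tar.gz")  # assumed download location
if archive.exists():
    print("OK" if matches(archive, EXPECTED_SHA256) else "MISMATCH")
```

Reading in chunks keeps memory use flat regardless of file size; the same helper works for the wheel by swapping in its file name and digest.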

File details

Details for the file astro_provider_ray-0.4.0-py3-none-any.whl.

File hashes

Hashes for astro_provider_ray-0.4.0-py3-none-any.whl
Algorithm Hash digest
SHA256 8f8c13e0129cc7c72709fbf33b9e1fd4c08fc22fb4e5f86667b87ab060a3c4b6
MD5 38c6ad7d3fd7c219f47a2b8c2099c2bd
BLAKE2b-256 ed61d417b4d2d1477631c2433792394e464aea18fd491f2ec86157919b2b74e8
