Provider for Apache Airflow. Implements the apache-airflow-providers-apache-spark package.

Project description

Package apache-airflow-providers-apache-spark

Release: 4.0.0rc1

Apache Spark

Provider package

This is a provider package for the apache.spark provider. All classes for this provider package are in the airflow.providers.apache.spark Python package.

You can find package information and changelog for the provider in the documentation.

Installation

To install this package on top of an existing Airflow 2 installation, run pip install apache-airflow-providers-apache-spark (see Requirements below for the minimum supported Airflow version).

The package supports the following Python versions: 3.7, 3.8, 3.9, 3.10

Requirements

PIP package       Version required
apache-airflow    >=2.3.0
pyspark
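
The requirements listed above can be checked from Python before upgrading. The following is a minimal sketch, not part of the provider: the helper names (version_tuple, meets_minimum, installed_version) are hypothetical, and the version comparison is deliberately simplified in that it ignores pre-release suffixes such as rc1. It assumes Python 3.8 or newer, where importlib.metadata is in the standard library.

```python
import re
from importlib import metadata
from typing import Optional, Tuple

# Minimum Airflow version taken from the requirements listed above.
MIN_AIRFLOW = "2.3.0"


def version_tuple(version: str) -> Tuple[int, ...]:
    """Return the leading numeric components, e.g. '4.0.0rc1' -> (4, 0, 0).

    Simplification: any pre-release or local suffix is ignored.
    """
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError(f"cannot parse version: {version!r}")
    return tuple(int(part) for part in match.group(0).split("."))


def meets_minimum(installed: str, minimum: str) -> bool:
    """Compare two versions after padding the shorter one with zeros."""
    have, need = version_tuple(installed), version_tuple(minimum)
    width = max(len(have), len(need))
    have += (0,) * (width - len(have))
    need += (0,) * (width - len(need))
    return have >= need


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    airflow = installed_version("apache-airflow")
    if airflow is None:
        print("apache-airflow is not installed")
    else:
        ok = meets_minimum(airflow, MIN_AIRFLOW)
        print(f"apache-airflow {airflow}: {'ok' if ok else 'too old'}")
    print("pyspark:", installed_version("pyspark") or "not installed")
```

For a strict comparison that also orders pre-releases correctly, a dedicated version-parsing library would be a better fit than this sketch.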

Changelog

4.0.0

This release of the provider is only available for Airflow 2.3+, as explained in the Apache Airflow providers support policy.

Breaking changes

Previously, the spark-binary connection extra could be set to any binary. As of version 4.0.0, only two values are allowed: spark-submit and spark2-submit.

The spark-home connection extra is no longer allowed. The binary must be available on the PATH in order to use SparkSubmitHook and SparkSubmitOperator.

  • Remove custom spark home and custom binaries for spark (#27646)
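
The breaking changes above can be checked against existing connection extras before upgrading. The following is a minimal sketch under stated assumptions, not the provider's own validation code: check_spark_extras is a hypothetical helper, the extras are assumed to be available as a plain dict, and a missing spark-binary is assumed to fall back to spark-submit.

```python
import shutil
from typing import Any, Dict, List

# The only two values the changelog says are accepted from 4.0.0 onwards.
ALLOWED_SPARK_BINARIES = ("spark-submit", "spark2-submit")


def check_spark_extras(extras: Dict[str, Any], check_path: bool = False) -> List[str]:
    """Return a list of problems found in a Spark connection's extras.

    Hypothetical helper for illustration. An empty list means the extras
    look compatible with the 4.0.0 rules described above.
    """
    problems = []

    # spark-home is no longer allowed at all.
    if "spark-home" in extras:
        problems.append("spark-home is no longer allowed; put the binary on the PATH")

    # Assumption: an unset spark-binary behaves like spark-submit.
    binary = extras.get("spark-binary", "spark-submit")
    if binary not in ALLOWED_SPARK_BINARIES:
        problems.append(
            f"spark-binary must be one of {ALLOWED_SPARK_BINARIES}, got {binary!r}"
        )
    elif check_path and shutil.which(binary) is None:
        # Optional: confirm the binary can actually be found on this machine.
        problems.append(f"{binary} was not found on the PATH")

    return problems


if __name__ == "__main__":
    old_style = {"spark-home": "/opt/spark", "spark-binary": "custom-spark-submit"}
    for problem in check_spark_extras(old_style):
        print("-", problem)
```

Passing check_path=True adds a lookup with shutil.which, which is only meaningful when run on the machine where the tasks will execute.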

Misc

  • Move min airflow version to 2.3.0 for all providers (#27196)

3.0.0

Breaking changes

Bug Fixes

  • Add typing for airflow/configuration.py (#23716)

  • Fix backwards-compatibility introduced by fixing mypy problems (#24230)

Misc

  • AIP-47 - Migrate spark DAGs to new design #22439 (#24210)

  • chore: Refactoring and Cleaning Apache Providers (#24219)

2.1.3

Bug Fixes

  • Fix mistakenly added install_requires for all providers (#22382)

2.1.2

Misc

  • Add Trove classifiers in PyPI (Framework :: Apache Airflow :: Provider)

2.1.1

Bug Fixes

  • fix param rendering in docs of SparkSubmitHook (#21788)

Misc

  • Support for Python 3.10

2.1.0

Features

  • Add more SQL template fields renderers (#21237)

  • Add optional features in providers. (#21074)

2.0.3

Bug Fixes

  • Ensure Spark driver response is valid before setting UNKNOWN status (#19978)

2.0.2

Bug Fixes

  • fix bug of SparkSql Operator log going to infinite loop. (#19449)

2.0.1

Misc

  • Optimise connection importing for Airflow 2.2.0

2.0.0

Breaking changes

  • Auto-apply apply_default decorator (#15667)

Bug Fixes

  • Make SparkSqlHook use Connection (#15794)

1.0.3

Bug Fixes

  • Fix 'logging.exception' redundancy (#14823)

1.0.2

Bug Fixes

  • Use apache.spark provider without kubernetes (#14187)

1.0.1

Updated documentation and readme files.

1.0.0

Initial version of the provider.
