Apache Airflow Operator exporting AWS Cost Explorer data to local file or S3

Airflow AWS Cost Explorer Plugin

A plugin for Apache Airflow that exports AWS Cost Explorer and S3 bucket metrics to a local file or to S3 in Parquet, JSON, or CSV format.

System Requirements

  • Airflow Versions
    • 1.10.3 or newer
  • pyarrow or fastparquet (optional, for writing Parquet files)

Deployment Instructions

  1. Install the plugin

    pip install airflow-aws-cost-explorer

  2. Optional for writing Parquet files - Install pyarrow or fastparquet

    pip install pyarrow

    or

    pip install fastparquet

  3. Restart the Airflow Web Server

  4. Configure the AWS connection (Conn type = 'aws')

  5. Optional for S3 - Configure the S3 connection (Conn type = 's3')
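
As an alternative to the web UI for step 4, Airflow can read a connection from an environment variable named AIRFLOW_CONN_<CONN_ID>. The sketch below builds such a URI for the aws_default connection. The helper name aws_conn_uri and the credentials are placeholders, not part of the plugin.

```python
import os
from urllib.parse import quote


def aws_conn_uri(access_key_id, secret_access_key):
    """Build an Airflow connection URI of type 'aws' (hypothetical helper)."""
    # Percent-encode both parts so characters such as '/' survive URI parsing.
    return "aws://{}:{}@".format(
        quote(access_key_id, safe=""), quote(secret_access_key, safe="")
    )


# Placeholder credentials for illustration only; never hard-code real secrets.
uri = aws_conn_uri("AKIAIOSFODNN7EXAMPLE", "wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY")

# Airflow resolves the connection id 'aws_default' from this variable.
os.environ["AIRFLOW_CONN_AWS_DEFAULT"] = uri
```

In practice the variable would be set in the environment of the scheduler and workers rather than from Python; the code only shows how the URI is assembled.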

Operators

AWSCostExplorerToS3Operator

    :param day:             Date to be exported as string in YYYY-MM-DD format or date/datetime instance (default: yesterday)
    :type day:              str, date or datetime
    :param aws_conn_id:     Cost Explorer AWS connection id (default: aws_default)
    :type aws_conn_id:      str
    :param region_name:     Cost Explorer AWS Region
    :type region_name:      str
    :param s3_conn_id:      Destination S3 connection id (default: s3_default)
    :type s3_conn_id:       str
    :param s3_bucket:       Destination S3 bucket
    :type s3_bucket:        str
    :param s3_key:          Destination S3 key
    :type s3_key:           str
    :param file_format:     Destination file format (parquet, json or csv; default: parquet)
    :type file_format:      str or FileFormat
    :param metrics:         Metrics (default: UnblendedCost, BlendedCost)
    :type metrics:          list
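
The parameters above could be assembled as in the following minimal sketch. The helper cost_explorer_to_s3_kwargs, the bucket name, and the key layout are hypothetical. It also assumes that day and s3_key accept Jinja templates; the README demonstrates templating only for the local-file operator.

```python
def cost_explorer_to_s3_kwargs(s3_bucket, s3_prefix, file_format="parquet"):
    """Collect keyword arguments for AWSCostExplorerToS3Operator (hypothetical helper)."""
    return {
        "task_id": "aws_cost_explorer_to_s3",
        # Assumption: 'day' is templated here as in the local-file example.
        "day": "{{ yesterday_ds }}",
        "aws_conn_id": "aws_default",
        "s3_conn_id": "s3_default",
        "s3_bucket": s3_bucket,
        # Assumption: 's3_key' is templated; one object per exported day.
        "s3_key": s3_prefix.strip("/") + "/{{ yesterday_ds }}." + file_format,
        "file_format": file_format,
        "metrics": ["UnblendedCost", "BlendedCost"],
    }


kwargs = cost_explorer_to_s3_kwargs("my-bucket", "cost-explorer")

# Inside a DAG file, with the plugin installed, the task would then be:
#   from airflow_aws_cost_explorer import AWSCostExplorerToS3Operator
#   task = AWSCostExplorerToS3Operator(dag=dag, **kwargs)
```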

AWSCostExplorerToLocalFileOperator

    :param day:             Date to be exported as string in YYYY-MM-DD format or date/datetime instance (default: yesterday)
    :type day:              str, date or datetime
    :param aws_conn_id:     Cost Explorer AWS connection id (default: aws_default)
    :type aws_conn_id:      str
    :param region_name:     Cost Explorer AWS Region
    :type region_name:      str
    :param destination:     Destination file complete path
    :type destination:      str
    :param file_format:     Destination file format (parquet, json or csv; default: parquet)
    :type file_format:      str or FileFormat
    :param metrics:         Metrics (default: UnblendedCost, BlendedCost)
    :type metrics:          list
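
To illustrate how day and metrics might translate into a Cost Explorer query, here is a sketch that normalizes day and builds the arguments for boto3's get_cost_and_usage call. The helpers resolve_day and cost_and_usage_request are hypothetical, and the plugin's actual request may differ, for example in granularity or grouping.

```python
from datetime import date, datetime, timedelta


def resolve_day(day=None, today=None):
    """Normalize a str, date or datetime to a date; default is yesterday."""
    if day is None:
        return (today or date.today()) - timedelta(days=1)
    # datetime is a subclass of date, so it has to be checked first.
    if isinstance(day, datetime):
        return day.date()
    if isinstance(day, date):
        return day
    return datetime.strptime(day, "%Y-%m-%d").date()


def cost_and_usage_request(day=None, metrics=("UnblendedCost", "BlendedCost")):
    """Build keyword arguments for the Cost Explorer get_cost_and_usage call."""
    start = resolve_day(day)
    end = start + timedelta(days=1)  # the End date is exclusive
    return {
        "TimePeriod": {"Start": start.isoformat(), "End": end.isoformat()},
        "Granularity": "DAILY",  # assumption: one data point per day
        "Metrics": list(metrics),
    }


request = cost_and_usage_request("2019-06-30")
```

With credentials in place, the request could be sent with boto3.client("ce").get_cost_and_usage(**request).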

AWSBucketSizeToS3Operator

    :param day:             Date to be exported as string in YYYY-MM-DD format or date/datetime instance (default: yesterday)
    :type day:              str, date or datetime
    :param aws_conn_id:     Cost Explorer AWS connection id (default: aws_default)
    :type aws_conn_id:      str
    :param region_name:     Cost Explorer AWS Region
    :type region_name:      str
    :param s3_conn_id:      Destination S3 connection id (default: s3_default)
    :type s3_conn_id:       str
    :param s3_bucket:       Destination S3 bucket
    :type s3_bucket:        str
    :param s3_key:          Destination S3 key
    :type s3_key:           str
    :param file_format:     Destination file format (parquet, json or csv; default: parquet)
    :type file_format:      str or FileFormat
    :param metrics:         Metrics (default: bucket_size, number_of_objects)
    :type metrics:          list
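
The README does not say where bucket_size and number_of_objects come from. One plausible source is the daily storage metrics that S3 publishes to CloudWatch in the AWS/S3 namespace. The sketch below builds arguments for boto3's get_metric_statistics call under that assumption; the helper bucket_metric_request is hypothetical and not part of the plugin.

```python
from datetime import datetime, timedelta


def bucket_metric_request(bucket, day, metric_name="BucketSizeBytes",
                          storage_type="StandardStorage"):
    """Build keyword arguments for CloudWatch get_metric_statistics."""
    start = datetime.strptime(day, "%Y-%m-%d")
    return {
        "Namespace": "AWS/S3",
        "MetricName": metric_name,
        "Dimensions": [
            {"Name": "BucketName", "Value": bucket},
            {"Name": "StorageType", "Value": storage_type},
        ],
        "StartTime": start,
        "EndTime": start + timedelta(days=1),
        "Period": 86400,  # S3 storage metrics are reported once per day
        "Statistics": ["Average"],
    }


size_request = bucket_metric_request("my-bucket", "2019-06-30")
# NumberOfObjects is published under the AllStorageTypes dimension value.
count_request = bucket_metric_request(
    "my-bucket", "2019-06-30", "NumberOfObjects", "AllStorageTypes"
)
```

Each request could then be passed to boto3.client("cloudwatch").get_metric_statistics(**size_request).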

AWSBucketSizeToLocalFileOperator

    :param day:             Date to be exported as string in YYYY-MM-DD format or date/datetime instance (default: yesterday)
    :type day:              str, date or datetime
    :param aws_conn_id:     Cost Explorer AWS connection id (default: aws_default)
    :type aws_conn_id:      str
    :param region_name:     Cost Explorer AWS Region
    :type region_name:      str
    :param destination:     Destination file complete path
    :type destination:      str
    :param file_format:     Destination file format (parquet, json or csv; default: parquet)
    :type file_format:      str or FileFormat
    :param metrics:         Metrics (default: bucket_size, number_of_objects)
    :type metrics:          list
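
The file_format parameter accepts either a string or a FileFormat value. The sketch below shows one way such a dispatch could work. The FileFormat enum here is a hypothetical stand-in, not the plugin's own class, and write_records is an illustrative helper. The Parquet branch relies on pandas with pyarrow or fastparquet installed.

```python
import csv
import json
from enum import Enum


class FileFormat(Enum):
    # Hypothetical stand-in; the plugin's FileFormat may be defined differently.
    PARQUET = "parquet"
    JSON = "json"
    CSV = "csv"


def write_records(records, destination, file_format="parquet"):
    """Write a list of dicts to destination in the requested format."""
    fmt = FileFormat(file_format)  # accepts a str value or a FileFormat member
    if fmt is FileFormat.JSON:
        with open(destination, "w") as f:
            json.dump(records, f)
    elif fmt is FileFormat.CSV:
        with open(destination, "w", newline="") as f:
            writer = csv.DictWriter(f, fieldnames=list(records[0]))
            writer.writeheader()
            writer.writerows(records)
    else:
        # Imported lazily because Parquet support is optional.
        import pandas as pd
        pd.DataFrame(records).to_parquet(destination)
```

For example, write_records([{"metric": "UnblendedCost", "value": 1.5}], "/tmp/costs.csv", "csv") writes a header row followed by one data row.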

Example

    #!/usr/bin/env python
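    from datetime import timedelta

    from airflow import DAG
    # Import days_ago explicitly instead of relying on "import airflow"
    # to have loaded the airflow.utils.dates submodule.
    from airflow.utils.dates import days_ago
    from airflow_aws_cost_explorer import AWSCostExplorerToLocalFileOperator

    default_args = {
        'owner': 'airflow',
        'depends_on_past': False,
        'start_date': days_ago(1),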
        'email': ['airflow@example.com'],
        'email_on_failure': False,
        'email_on_retry': False,
        'retries': 1,
        'retry_delay': timedelta(minutes=30)
    }

    dag = DAG('cost_explorer',
        default_args=default_args,
        schedule_interval=None,
        concurrency=1,
        max_active_runs=1,
        catchup=False
    )

    aws_cost_explorer_to_file = AWSCostExplorerToLocalFileOperator(
        task_id='aws_cost_explorer_to_file',
        day='{{ yesterday_ds }}',
        destination='/tmp/{{ yesterday_ds }}.parquet',
        file_format='parquet',
        dag=dag)

    if __name__ == "__main__":
        dag.cli()
