
airflow-postgres-csv


Airflow operators for bulk PostgreSQL <-> CSV transfers using COPY. Supports Airflow 2.9+ and Airflow 3.

Listed on the Apache Airflow Ecosystem page.

Operators

  • PostgresToCsvOperator - Run a SQL query and export results to a CSV file
  • CsvToPostgresOperator - Load a CSV file into a PostgreSQL table

Both use PostgreSQL's COPY command for maximum throughput.
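For intuition, COPY-based transfer streams a query result out (or CSV rows in) through statements shaped roughly like the sketch below. The helper names and exact option handling are illustrative assumptions, not the library's internals:

```python
def copy_to_stdout_sql(query: str, has_header: bool = True) -> str:
    """COPY statement for streaming a query's result out as CSV."""
    header = ", HEADER" if has_header else ""
    return f"COPY ({query}) TO STDOUT WITH (FORMAT CSV{header})"


def copy_from_stdin_sql(table: str, has_header: bool = True) -> str:
    """COPY statement for bulk-loading CSV rows into a table."""
    header = ", HEADER" if has_header else ""
    return f"COPY {table} FROM STDIN WITH (FORMAT CSV{header})"
```

Because COPY moves rows in a single streaming round trip rather than row-by-row INSERTs, it is the standard choice for bulk CSV transfer in PostgreSQL.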

Installation

pip install airflow-postgres-csv

Usage

from airflow_postgres_csv import PostgresToCsvOperator, CsvToPostgresOperator

# Export query results to CSV
export_task = PostgresToCsvOperator(
    task_id="export_users",
    conn_id="my_postgres",
    sql="SELECT * FROM users WHERE active = %(active)s",
    parameters={"active": True},
    csv_file_path="/tmp/users.csv",
)

# Load CSV into a table (with truncate)
import_task = CsvToPostgresOperator(
    task_id="import_users",
    conn_id="my_postgres",
    table_name="staging.users",
    csv_file_path="/tmp/users.csv",
    truncate=True,
)

SQL from file

The sql parameter supports multiple formats:

# Inline SQL
PostgresToCsvOperator(sql="SELECT * FROM users", ...)

# Relative path (loaded by Airflow from DAG folder or template_searchpath)
PostgresToCsvOperator(sql="sql/export_users.sql", ...)

# Absolute path (loaded directly)
PostgresToCsvOperator(sql="/opt/airflow/sql/export_users.sql", ...)
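A minimal sketch of how this kind of resolution can work. The real operator goes through Airflow's templating machinery; the function and parameter names here are hypothetical:

```python
from pathlib import Path


def resolve_sql(sql: str, search_paths: tuple[str, ...] = ()) -> str:
    """Return inline SQL as-is; read .sql paths from disk, trying an
    absolute path first and then each search path in order."""
    if not sql.strip().endswith(".sql"):
        return sql  # inline SQL string
    path = Path(sql)
    if path.is_absolute():
        return path.read_text()
    for base in search_paths:  # e.g. the DAG folder, template_searchpath
        candidate = Path(base) / sql
        if candidate.exists():
            return candidate.read_text()
    raise FileNotFoundError(f"SQL file not found: {sql}")
```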

Gzip compression

Both operators support gzip compression for large files:

# Export to gzip
PostgresToCsvOperator(
    sql="SELECT * FROM large_table",
    csv_file_path="/tmp/data.csv.gz",
    compression="gzip",
    ...
)

# Import from gzip
CsvToPostgresOperator(
    csv_file_path="/tmp/data.csv.gz",
    compression="gzip",
    ...
)

Parameters

PostgresToCsvOperator

Parameter      Description                                            Default
-------------  -----------------------------------------------------  --------
conn_id        Airflow Postgres connection ID                         required
csv_file_path  Output file path (templated)                           required
sql            SQL query string, or path to a .sql file               required
parameters     Dict passed to cursor.mogrify                          {}
has_header     Include a CSV header row                               True
compression    Compression format ("gzip" or None)                    None
timeout        Query timeout in minutes                               60
count_lines    Count and log the CSV's line count after writing       True

CsvToPostgresOperator

Parameter      Description                                            Default
-------------  -----------------------------------------------------  --------
conn_id        Airflow Postgres connection ID                         required
table_name     Target table (templated, supports schema.table)        required
csv_file_path  Input file path (templated)                            required
columns        Explicit column list                                   None
has_header     CSV has a header row                                   True
truncate       Truncate the table before loading                      False
compression    Compression format ("gzip" or None)                    None
delimiter      CSV delimiter                                          ","
quote_char     CSV quote character                                    '"'
null_string    String representing NULL                               ""
timeout        Query timeout in minutes                               60
count_lines    Count and log the CSV's line count before loading      True
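The delimiter, quote_char, and null_string parameters map naturally onto COPY's WITH (...) options; an illustrative builder (the real operator's exact escaping and option set are assumptions):

```python
def copy_from_options(
    delimiter: str = ",",
    quote_char: str = '"',
    null_string: str = "",
    has_header: bool = True,
) -> str:
    """Render the WITH (...) options clause for COPY ... FROM STDIN."""
    def lit(s: str) -> str:
        # SQL string literal: double any embedded single quotes
        return "'" + s.replace("'", "''") + "'"

    opts = ["FORMAT CSV"]
    if has_header:
        opts.append("HEADER")
    opts += [f"DELIMITER {lit(delimiter)}",
             f"QUOTE {lit(quote_char)}",
             f"NULL {lit(null_string)}"]
    return "WITH (" + ", ".join(opts) + ")"
```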

Development

Running tests

Tests can be run against both supported Airflow versions using tox:

pip install tox

tox -e airflow2   # test against Airflow 2.x
tox -e airflow3   # test against Airflow 3.x
tox               # run both

Each environment installs the correct Airflow and provider versions automatically; no manual dependency management is needed.

Requirements

                                   Airflow 2     Airflow 3
---------------------------------  ------------  -----------
apache-airflow                     >=2.9, <3.0   >=3.0
apache-airflow-providers-postgres  >=5.0, <6.0   >=6.0
Python                             3.10 – 3.13   3.10 – 3.13

License

Apache 2.0

Download files

Source Distribution

airflow_postgres_csv-0.4.0.tar.gz (201.1 kB)

Built Distribution

airflow_postgres_csv-0.4.0-py3-none-any.whl (9.8 kB)

File details

airflow_postgres_csv-0.4.0.tar.gz

  • Size: 201.1 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7
  • SHA256: f22662a98bad92670ae24dee167229d806b3a555318792e8a6ea403467a512c7
  • MD5: 83a87ef5fc220fbbe80f4edc6d77a721
  • BLAKE2b-256: 474115a0ef6b7a6260acef79cee419fafbeddff229cb34f6336328b9c9a6da39

airflow_postgres_csv-0.4.0-py3-none-any.whl

  • SHA256: 093efe84291794ec13baa1fcc072dff66a3a18959fb2034e18a5fd7f22719dc7
  • MD5: b2bb90f1003761a8eddcb27619466636
  • BLAKE2b-256: 2b06b6bb9d24b8ee372dc7ab8b85be34498878c9d8296b65a56813e4c53bae2f

Provenance

Attestation bundles were published for both files by publish.yml on Redevil10/airflow-postgres-csv. Values shown reflect the state when the release was signed and may no longer be current.
