Light IO transforms for Postgres read/write in Apache Beam pipelines.

beam-postgres

Goal

The project aims to provide highly performant, customizable transforms for Postgres only. Supporting many different SQL database engines is not a goal.

Features

  • ReadAllFromPostgres, ReadFromPostgres and WriteToPostgres transforms
  • Records can be mapped to tuples, dictionaries or dataclasses
  • Reads and writes are performed in configurable batches
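
To make the record-mapping feature concrete, here is a minimal sketch of what mapping one database record to a tuple, a dictionary, or a dataclass means. It uses only the standard library and is not the library's implementation; the helper names (to_tuple, to_dict, to_dataclass), the column names and the sample row are assumptions made up for illustration.

```python
from dataclasses import dataclass
from typing import Any, Dict, Sequence, Tuple


@dataclass
class Example:
    id: int
    data: str


def to_tuple(row: Sequence[Any]) -> Tuple[Any, ...]:
    """Keep the record as a plain positional tuple."""
    return tuple(row)


def to_dict(columns: Sequence[str], row: Sequence[Any]) -> Dict[str, Any]:
    """Pair each column name with its value."""
    return dict(zip(columns, row))


def to_dataclass(columns: Sequence[str], row: Sequence[Any]) -> Example:
    """Build a typed object from the column/value pairs."""
    return Example(**to_dict(columns, row))


# Hypothetical raw result: column names plus one row of values.
columns = ("id", "data")
row = (1, "example1")

print(to_tuple(row))
print(to_dict(columns, row))
print(to_dataclass(columns, row))
```

Tuples are the cheapest form, dictionaries make fields addressable by name, and dataclasses add type hints and attribute access for downstream transforms.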

Usage

Printing data from the database table:

import apache_beam as beam
from psycopg.rows import dict_row

from beam_postgres.io import ReadAllFromPostgres

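with beam.Pipeline() as p:
    data = p | "Reading example records from database" >> ReadAllFromPostgres(
        # Connection string for the source database
        "host=localhost dbname=examples user=postgres password=postgres",
        # Query to run against the source table
        "select id, data from source",
        # psycopg row factory: each record is returned as a dictionary
        dict_row,
    )
    # Print every record to stdout
    data | "Writing to stdout" >> beam.Map(print)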

Writing data to the database table:

from dataclasses import dataclass

import apache_beam as beam
from apache_beam.options.pipeline_options import PipelineOptions

from beam_postgres.io import WriteToPostgres


@dataclass
class Example:
    data: str


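with beam.Pipeline(options=PipelineOptions()) as p:
    # Build an in-memory PCollection of dataclass records
    data = p | "Reading example records" >> beam.Create(
        [
            Example("example1"),
            Example("example2"),
        ]
    )
    data | "Writing example records to database" >> WriteToPostgres(
        # Connection string for the target database
        "host=localhost dbname=examples user=postgres password=postgres",
        # Parameterized insert statement with one %s placeholder per field
        "insert into sink (data) values (%s)",
    )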
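
The idea of writing in configurable batches can be sketched without a Postgres server. The snippet below is an illustration under stated assumptions, not the code WriteToPostgres runs: sqlite3 stands in for Postgres, the helpers batched and write_in_batches are hypothetical names, and the batch size of 2 is arbitrary. Note that sqlite3 uses "?" placeholders where psycopg uses "%s".

```python
import sqlite3
from dataclasses import astuple, dataclass
from itertools import islice
from typing import Iterable, Iterator, List


@dataclass
class Example:
    data: str


def batched(items: Iterable, batch_size: int) -> Iterator[List]:
    """Yield consecutive lists of at most batch_size items."""
    iterator = iter(items)
    while batch := list(islice(iterator, batch_size)):
        yield batch


def write_in_batches(conn, statement: str, records: Iterable, batch_size: int) -> int:
    """Issue one executemany call per batch and return the number of batches."""
    batch_count = 0
    for batch in batched(records, batch_size):
        # Convert each dataclass to a parameter tuple for the statement.
        conn.executemany(statement, [astuple(record) for record in batch])
        conn.commit()
        batch_count += 1
    return batch_count


# In-memory SQLite database used only so the sketch is runnable anywhere.
conn = sqlite3.connect(":memory:")
conn.execute("create table sink (data text)")

records = [Example(f"example{i}") for i in range(1, 6)]
batch_count = write_in_batches(
    conn, "insert into sink (data) values (?)", records, batch_size=2
)
row_count = conn.execute("select count(*) from sink").fetchone()[0]
print(batch_count, row_count)
```

Larger batches mean fewer round trips to the database, while smaller batches keep less data in memory per call.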

See here for more examples.

Reading in batches

When a table holds more data than fits into memory, read it in batches instead of all at once. You can see example code here (it reads records in batches of 1).
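
The general technique behind batched reads can be sketched with the DB-API fetchmany call, which returns at most a fixed number of rows per call. This is a minimal illustration, not the linked example and not the library's internals: sqlite3 stands in for Postgres so the code runs without a server, and the function name read_in_batches, the table contents and the batch size are assumptions.

```python
import sqlite3
from typing import Iterator, List, Tuple


def read_in_batches(conn, query: str, batch_size: int) -> Iterator[List[Tuple]]:
    """Yield lists of at most batch_size rows until the result is exhausted."""
    cursor = conn.execute(query)
    while True:
        batch = cursor.fetchmany(batch_size)
        if not batch:
            break
        yield batch


# In-memory SQLite database with five sample rows.
conn = sqlite3.connect(":memory:")
conn.execute("create table source (id integer primary key, data text)")
conn.executemany(
    "insert into source (data) values (?)",
    [(f"example{i}",) for i in range(1, 6)],
)

batches = list(read_in_batches(conn, "select id, data from source order by id", 2))
for batch in batches:
    print(batch)
```

Only one batch of rows is held by the consumer at a time, so memory use depends on the batch size rather than on the size of the table.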

Download files

Source Distribution

beam-postgres-0.4.1.tar.gz (10.6 kB)

Uploaded Source

Built Distribution

beam_postgres-0.4.1-py3-none-any.whl (10.5 kB)

Uploaded Python 3

File details

Details for the file beam-postgres-0.4.1.tar.gz.

File metadata

  • Download URL: beam-postgres-0.4.1.tar.gz
  • Upload date:
  • Size: 10.6 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.2 CPython/3.11.9

File hashes

Hashes for beam-postgres-0.4.1.tar.gz
Algorithm Hash digest
SHA256 f5010985e98f864e7f34568bae4961cb5e063788322fa058c692ce1801923b5e
MD5 a02d1e16f26b63491e9c17904bacac83
BLAKE2b-256 f3e797478823e7a5d0c7c65fc00eb11121cc12f4e23b1709554831b159818fdc

See more details on using hashes here.

File details

Details for the file beam_postgres-0.4.1-py3-none-any.whl.

File metadata

  • Download URL: beam_postgres-0.4.1-py3-none-any.whl
  • Upload date:
  • Size: 10.5 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.2 CPython/3.11.9

File hashes

Hashes for beam_postgres-0.4.1-py3-none-any.whl
Algorithm Hash digest
SHA256 1d73cf35506548c665101c7f19c21ce0152ea795cedb1dc5392588a6ff2840e7
MD5 54937a5ca289e38f3f401cf8336a6d80
BLAKE2b-256 a73da72457549c731ac9cc86761317a0c1b1a6bdb84c4e05ae1c44517c37bbed
