Change Data Capture (CDC) library for Postgres
pypgcdc
Change Data Capture (CDC) tool for Postgres
This project is a Python implementation of a Postgres CDC client. It is intended to be used as a library for building CDC applications. It is not intended to be used as a standalone application, but it ships with a runnable example that is a useful starting point.
The main problem with CDC and Postgres in practice is that Postgres usually runs with a hot standby used for failover in case of a primary node failure. The switch is usually hidden behind a DNS record, so ordinary clients just reconnect and run fine. The problem is that the replication slot is not copied over to the new primary node, so the CDC client will either fail to start (no slot) or create a new slot and potentially miss some data (everything written before the slot was created).
You can, of course, create a new slot, do the initial sync where you copy over all the available data, and then start replicating as usual. The problem is that the initial sync can take a long time if you have big tables.
This library doesn't really solve the problem, but it provides a way to react to the failover event. The idea is that you add triggers to the tables you want to replicate (published tables) and store the inserts/updates/deletes in separate tables (log tables).
The initial sync when you start your CDC app can just select * from the published tables and then tail the changes on these tables. You will need some persistence to store the commits you have already processed, so that when a failover event occurs you can use the last processed commit to select any pending changes from the log tables and then carry on tailing the published tables. This way you can avoid repeating the initial sync and catch up with any pending changes faster.
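The catch-up step described above can be sketched in a few lines of Python. This is only an illustration of the idea, not the library's API: `LogEntry`, `pending_changes`, `catch_up` and the integer `commit_id` are hypothetical names, and the log table is modelled as an in-memory list instead of a real Postgres table.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass(frozen=True)
class LogEntry:
    """One row of a (hypothetical) log table filled by the triggers."""
    commit_id: int  # assumption: increases monotonically with each commit
    table: str      # name of the published table the change belongs to
    op: str         # "INSERT", "UPDATE" or "DELETE"
    row: dict       # the row data captured by the trigger


def pending_changes(log: Iterable[LogEntry], last_processed: int) -> List[LogEntry]:
    """Return the log entries newer than the last processed commit, oldest first."""
    newer = [entry for entry in log if entry.commit_id > last_processed]
    return sorted(newer, key=lambda entry: entry.commit_id)


def catch_up(log: Iterable[LogEntry], last_processed: int,
             apply: Callable[[LogEntry], None]) -> int:
    """Apply every pending change and return the new checkpoint value.

    A real application would persist the checkpoint durably, ideally once
    per fully applied commit, so it survives a restart of the CDC app.
    """
    for entry in pending_changes(log, last_processed):
        apply(entry)
        last_processed = entry.commit_id
    return last_processed
```

After a failover, the app would call something like `catch_up` with the persisted checkpoint and only then resume tailing the published tables through the new replication slot.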
Env Vars
- PYPGCDC_DSN, default "postgres://postgres:postgrespw@localhost:5432/test" -- Postgres connection string
- PYPGCDC_SLOT, default "test_slot" -- Postgres replication slot name
- PYPGCDC_PUBLICATION, default "test_publication" -- Postgres publication name
- PYPGCDC_LSN, default 0 -- Postgres LSN to start from
- PYPGCDC_VERBOSE, default "False" -- A flag used to control print output of the example datastore. Use one of ("1", "true", "yes") to enable more verbose output.
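As an illustration of how these variables and their defaults fit together, here is a minimal sketch of reading them in Python. The `Settings` class and `load_settings` function are hypothetical helpers, not part of the library; keeping the LSN as a string and matching the verbose flag case-insensitively are assumptions as well.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional

# Values that switch verbose output on, as listed above.
TRUTHY = ("1", "true", "yes")


@dataclass(frozen=True)
class Settings:
    dsn: str
    slot: str
    publication: str
    lsn: str       # kept as the raw string; how it is parsed is not assumed here
    verbose: bool


def load_settings(env: Optional[Mapping[str, str]] = None) -> Settings:
    """Read the PYPGCDC_* variables, falling back to the documented defaults."""
    env = os.environ if env is None else env
    return Settings(
        dsn=env.get("PYPGCDC_DSN", "postgres://postgres:postgrespw@localhost:5432/test"),
        slot=env.get("PYPGCDC_SLOT", "test_slot"),
        publication=env.get("PYPGCDC_PUBLICATION", "test_publication"),
        lsn=env.get("PYPGCDC_LSN", "0"),
        # Assumption: the comparison ignores case and surrounding whitespace.
        verbose=env.get("PYPGCDC_VERBOSE", "False").strip().lower() in TRUTHY,
    )
```

Passing a plain dict instead of relying on `os.environ` makes the helper easy to exercise in tests.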
Example
The library comes with an example which can be used to see how it works. The example requires a running Postgres database with some tables and an existing publication. The example will create a replication slot if it doesn't exist and start tailing the changes. The example will print the changes to stdout.
Once you finish with the example, remember to drop the replication slot. Leaving an unused replication slot is dangerous as the WAL files used for replication might not be removed, and you can run out of disk space (not an issue on your local computer but quite a problem on your production servers...).
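Dropping the slot can be done with the Postgres function `pg_drop_replication_slot`. The sketch below wraps it in a small Python helper; `drop_replication_slot` is a hypothetical name, not part of the library, and it assumes a DB-API connection that uses the `%s` parameter style (as psycopg2 does). The query goes through `pg_replication_slots` and skips active slots, so it does nothing if the slot is missing or still in use.

```python
# Drops the slot only if it exists and no client is currently attached to it.
DROP_SLOT_SQL = (
    "SELECT pg_drop_replication_slot(slot_name) "
    "FROM pg_replication_slots "
    "WHERE slot_name = %s AND NOT active"
)


def drop_replication_slot(conn, slot_name: str) -> bool:
    """Drop an inactive replication slot; return True if a slot was dropped.

    `conn` is assumed to be a regular (non-replication) DB-API connection.
    """
    cur = conn.cursor()
    try:
        cur.execute(DROP_SLOT_SQL, (slot_name,))
        dropped = cur.fetchall()  # one row per slot that matched and was dropped
    finally:
        cur.close()
    conn.commit()
    return len(dropped) > 0
```

With the default settings this would be called as `drop_replication_slot(conn, "test_slot")` once the example has been stopped.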