A powerful database replication and synchronization tool
Database Sync Service
A high-performance, resilient PostgreSQL data replication service designed for selective column synchronization and automated schema evolution.
Overview
This service replicates specific tables and columns from a primary PostgreSQL database to one or more replica databases. It is built for cases where you need specialized read replicas, or need to share data across microservices, while keeping each replica's schema strictly controlled.
Configuration
The service is configured via config/sync.yaml.
primary_db:
  url: postgresql://user:pass@localhost:5432/primary_db?sslmode=disable
replica_dbs:
  - name: replica_1
    url: postgresql://user:pass@localhost:5433/replica_db?sslmode=disable
tables:
  users:
    primary_key: id
    mode: upsert # Options: insert | upsert
    batch_size: 10000
    # Columns to extract and maintain
    columns_to_sync:
      - user_name
      - email
      - metadata
      - updated_at
    # Define if primary and replica column names differ
    column_mapping:
      # primary_col: replica_col
      user_name: username
    # Columns to update on conflict (if mode is upsert)
    conflict_resolution:
      update_columns:
        - username
        - email
        - updated_at
    checksum:
      enabled: true
      columns:
        - email
        - username
  orders:
    primary_key: order_id
    mode: insert
    batch_size: 5000
    columns_to_sync:
      - customer_id
      - total_amount
      - status
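To make the relationship between columns_to_sync, column_mapping, and the conflict and checksum column lists concrete, here is a minimal Python sketch. It is not the service's own code: the users_cfg dictionary is a hand-written copy of the users entry above, and the helper names replica_columns and check_table are hypothetical. It assumes, based on the example config, that update_columns and checksum columns use replica-side names.

```python
# Hand-written copy of the `users` entry from the example config (assumption:
# the real service parses config/sync.yaml into a similar nested structure).
users_cfg = {
    "primary_key": "id",
    "mode": "upsert",
    "columns_to_sync": ["user_name", "email", "metadata", "updated_at"],
    "column_mapping": {"user_name": "username"},
    "conflict_resolution": {"update_columns": ["username", "email", "updated_at"]},
    "checksum": {"enabled": True, "columns": ["email", "username"]},
}


def replica_columns(table_cfg: dict) -> list[str]:
    """Translate primary column names to replica names, preserving order."""
    mapping = table_cfg.get("column_mapping") or {}
    return [mapping.get(col, col) for col in table_cfg["columns_to_sync"]]


def check_table(table_cfg: dict) -> list[str]:
    """Return a list of problems found; an empty list means the entry is consistent."""
    problems = []
    if table_cfg.get("mode") not in ("insert", "upsert"):
        problems.append(f"unknown mode: {table_cfg.get('mode')!r}")

    known = set(replica_columns(table_cfg))
    update_cols = table_cfg.get("conflict_resolution", {}).get("update_columns", [])
    checksum_cols = table_cfg.get("checksum", {}).get("columns", [])

    # Every column named in these lists should be one that is actually synced.
    for label, cols in (("update_columns", update_cols), ("checksum", checksum_cols)):
        for col in cols:
            if col not in known:
                problems.append(f"{label} refers to unsynced column: {col}")
    return problems


print(replica_columns(users_cfg))
print(check_table(users_cfg))
```

A check like this catches a common mistake: listing user_name (the primary-side name) under update_columns when the replica column is called username.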
Getting Started
Prerequisites
- Python 3.10+
- PostgreSQL instances (Primary and Replica)
Installation
You can install syncset-db using pip:
pip install syncset-db
Running the Service
You can run the service using the globally installed syncset command or directly via the script.
Using the CLI tool:
# Run with a custom configuration file (Recommended)
syncset --file=sync.yaml
# Run with custom config and dry-run mode
syncset --file=sync.yaml --dry-run
If you omit --file, the service reads config/sync.yaml.
Using Python directly:
# Start Sync
python3 cli.py --file=sync.yaml
# Dry Run
python3 cli.py --file=sync.yaml --dry-run
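The two flags used in these commands can be modeled with Python's standard argparse module. This is only a sketch of the documented behavior (a --file option that defaults to config/sync.yaml and a --dry-run switch); the function name build_parser is hypothetical and the real cli.py may be organized differently.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser that mirrors the flags described in this README."""
    parser = argparse.ArgumentParser(prog="syncset")
    parser.add_argument(
        "--file",
        default="config/sync.yaml",  # documented default when --file is omitted
        help="path to the YAML configuration file",
    )
    parser.add_argument(
        "--dry-run",
        action="store_true",  # False unless the flag is present
        help="run in dry-run mode",
    )
    return parser


# Parse an explicit argument list so the example does not depend on sys.argv.
args = build_parser().parse_args(["--file=sync.yaml", "--dry-run"])
print(args.file, args.dry_run)
```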
Key Features
- Selective Replication: Sync only the tables and columns you need.
- Incremental Sync: Tracks synchronization state via high-watermark primary keys to ensure only new or modified data is processed.
- Data Integrity: Optional checksum-based validation to ensure rows are truly identical before skipping them.
- Multi-Replica Support: Synchronize the same primary data to multiple independent targets in parallel.
Architecture
The synchronization follows a batched extraction and load pattern:
- Extract: Read the next batch of the configured columns from the primary database, starting after the last processed primary key.
- Validate: Perform checksum comparisons (if enabled) against existing replica data to minimize redundant writes.
- Load: Execute bulk upserts or inserts into the replica database.
- State Update: Persist the highest processed primary key to .sync_state.json.
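The load and state-update steps can be sketched in plain Python. The SQL uses PostgreSQL's INSERT ... ON CONFLICT ... DO UPDATE syntax; everything else is an assumption made for illustration: the helper names batches, build_load_sql, and highest_key are hypothetical, the %s placeholders assume a psycopg-style driver, and identifiers are interpolated unquoted for brevity.

```python
from typing import Iterator, Sequence


def batches(rows: Sequence[dict], size: int) -> Iterator[Sequence[dict]]:
    """Yield consecutive slices of at most `size` rows."""
    for start in range(0, len(rows), size):
        yield rows[start:start + size]


def build_load_sql(
    table: str,
    primary_key: str,
    columns: Sequence[str],
    update_columns: Sequence[str] = (),
) -> str:
    """Build an INSERT, or an upsert when update_columns is non-empty.

    Identifiers are not quoted here, so pass only trusted names from config.
    """
    all_cols = [primary_key, *columns]
    col_list = ", ".join(all_cols)
    placeholders = ", ".join(["%s"] * len(all_cols))
    sql = f"INSERT INTO {table} ({col_list}) VALUES ({placeholders})"
    if update_columns:
        assignments = ", ".join(f"{c} = EXCLUDED.{c}" for c in update_columns)
        sql += f" ON CONFLICT ({primary_key}) DO UPDATE SET {assignments}"
    return sql


def highest_key(batch: Sequence[dict], primary_key: str, current=None):
    """Return the largest primary key seen so far (the high-watermark)."""
    keys = [row[primary_key] for row in batch]
    if current is not None:
        keys.append(current)
    return max(keys, default=None)


print(build_load_sql("users", "id", ["username", "email"], ["email"]))
```

In a loop over batches(rows, batch_size), the statement would be executed once per row or via the driver's bulk API, and highest_key would be persisted only after the batch commits, so a failed batch is retried on the next run.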
State Management
Replication progress is stored in .sync_state.json. To re-trigger a full synchronization for a specific table, simply remove its entry from this file or delete the file entirely.
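Removing a single table's entry can also be scripted. The sketch below assumes that .sync_state.json is a JSON object keyed by table name; the README only says that each table has an entry, so check the actual file layout before relying on this. The function name reset_table_state is hypothetical.

```python
import json
from pathlib import Path


def reset_table_state(state_path: str, table: str) -> bool:
    """Remove one table's entry from the state file.

    Returns True if an entry was removed, False if the file or entry was absent.
    Assumes the file holds a JSON object keyed by table name.
    """
    path = Path(state_path)
    if not path.exists():
        return False
    state = json.loads(path.read_text(encoding="utf-8"))
    if table not in state:
        return False
    del state[table]
    path.write_text(json.dumps(state, indent=2), encoding="utf-8")
    return True
```

For example, reset_table_state(".sync_state.json", "users") would leave the progress of every other table untouched. Stop the service before editing the file so a running sync does not overwrite the change.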
Future Plans
- CDC Support: Implement logical decoding to enable near real-time synchronization.
- Monitoring: Integration with Prometheus and Grafana for health and performance monitoring.
- Web Dashboard: A lightweight management UI to monitor sync progress and adjust configuration visually.
- Multi-Database Support: Extend beyond PostgreSQL to support MySQL, SQLite, and MongoDB as targets.
- Compression: Add support for data compression during transit for high-latency connections.
License
MIT