Postgres/MySQL/MariaDB to Elasticsearch/OpenSearch sync
PGSync is middleware for syncing data from PostgreSQL, MySQL, or MariaDB to Elasticsearch or OpenSearch.
Keep your relational database as the source of truth and expose structured denormalized documents in your search engine.
Key Features
Real-time sync via logical decoding (PostgreSQL) or binary log (MySQL/MariaDB)
Denormalize complex relational data into nested search documents
JSON schema-based configuration
Support for one-to-one and one-to-many relationships
Plugin system for document transformation
Multiple operation modes: daemon, polling, or direct WAL streaming
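To make the denormalization feature concrete, here is a minimal sketch in plain Python of how rows from related tables could be folded into one nested document. It does not use PGSync itself; the in-memory rows and the `publisher_id`, `book_isbn`, and `author_id` column names are assumptions made up for illustration.

```python
# Hypothetical in-memory rows standing in for the book, publisher, author,
# and book_author tables. The foreign key column names are assumptions.
books = [{"isbn": "001", "title": "Example Book", "publisher_id": 1}]
publishers = {1: {"name": "Example Press"}}
authors = {10: {"name": "A. Writer"}, 11: {"name": "B. Writer"}}
book_author = [
    {"book_isbn": "001", "author_id": 10},
    {"book_isbn": "001", "author_id": 11},
]


def denormalize(book):
    """Build one nested document for a single book row."""
    # Follow the join table to find every author linked to this book.
    author_ids = [
        link["author_id"]
        for link in book_author
        if link["book_isbn"] == book["isbn"]
    ]
    return {
        "isbn": book["isbn"],
        "title": book["title"],
        # One-to-one: embed the publisher as a single object.
        "publisher": publishers[book["publisher_id"]],
        # One-to-many: embed the authors as a list of objects.
        "authors": [authors[author_id] for author_id in author_ids],
    }


documents = [denormalize(book) for book in books]
print(documents[0])
```

The resulting dictionary has the shape of a search document: the parent row's columns at the top level, with related rows nested beneath it.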
Requirements
Python 3.9+
PostgreSQL 9.6+ or MySQL 8.0.0+ or MariaDB 12.0.0+
Elasticsearch 6.3.1+ or OpenSearch 1.3.7+
Installation
Install from PyPI:
pip install pgsync
Database Setup
PostgreSQL
Enable logical decoding in your PostgreSQL configuration (postgresql.conf):
wal_level = logical
max_replication_slots = 1
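As a quick sanity check, the two settings above can be verified against the text of a configuration file. The sketch below is a simplified parser written for this example (the helper names are hypothetical); it only handles `key = value` lines and treats a missing `max_replication_slots` as 0, which is stricter than the server's own defaults.

```python
def parse_conf(text):
    """Parse 'key = value' lines, ignoring blank lines and # comments."""
    settings = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()
        if not line or "=" not in line:
            continue
        key, value = line.split("=", 1)
        settings[key.strip()] = value.strip().strip("'")
    return settings


def ready_for_logical_decoding(settings):
    """True if wal_level is logical and at least one slot is configured."""
    # Assumption: a missing max_replication_slots counts as 0 in this sketch.
    slots = int(settings.get("max_replication_slots", "0"))
    return settings.get("wal_level") == "logical" and slots >= 1


sample = """
wal_level = logical        # required for logical decoding
max_replication_slots = 1
"""
print(ready_for_logical_decoding(parse_conf(sample)))
```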
MySQL / MariaDB
Enable binary logging in your MySQL/MariaDB configuration (my.cnf):
server-id = 1
log_bin = mysql-bin
binlog_row_image = FULL
binlog_expire_logs_seconds = 604800
Create a replication user:
CREATE USER 'replicator'@'%' IDENTIFIED WITH mysql_native_password BY 'password';
GRANT REPLICATION SLAVE, REPLICATION CLIENT ON *.* TO 'replicator'@'%';
FLUSH PRIVILEGES;
Configuration
Create a JSON schema file (e.g., schema.json) defining your sync mapping:
[
    {
        "database": "book",
        "index": "book",
        "nodes": {
            "table": "book",
            "columns": ["isbn", "title", "description"],
            "children": [
                {
                    "table": "publisher",
                    "columns": ["name"],
                    "relationship": {
                        "variant": "object",
                        "type": "one_to_one"
                    }
                },
                {
                    "table": "author",
                    "label": "authors",
                    "columns": ["name", "date_of_birth"],
                    "relationship": {
                        "variant": "object",
                        "type": "one_to_many",
                        "through_tables": ["book_author"]
                    }
                }
            ]
        }
    }
]
See the examples directory for more schema examples (airbnb, social, rental, etc.).
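The schema is a tree: each entry names a database and index, and its root node can carry nested children. The following sketch walks that tree with the standard `json` module to list each table and its relationship type. It is only a reading aid, not PGSync's own schema loader; `iter_nodes` is a hypothetical helper written for this example.

```python
import json

# The example schema from above, embedded so the sketch is self-contained.
SCHEMA = """
[
    {
        "database": "book",
        "index": "book",
        "nodes": {
            "table": "book",
            "columns": ["isbn", "title", "description"],
            "children": [
                {
                    "table": "publisher",
                    "columns": ["name"],
                    "relationship": {"variant": "object", "type": "one_to_one"}
                },
                {
                    "table": "author",
                    "label": "authors",
                    "columns": ["name", "date_of_birth"],
                    "relationship": {
                        "variant": "object",
                        "type": "one_to_many",
                        "through_tables": ["book_author"]
                    }
                }
            ]
        }
    }
]
"""


def iter_nodes(node, depth=0):
    """Yield (depth, node) for a node and all of its descendants."""
    yield depth, node
    for child in node.get("children", []):
        yield from iter_nodes(child, depth + 1)


for entry in json.loads(SCHEMA):
    print(f"database={entry['database']} index={entry['index']}")
    for depth, node in iter_nodes(entry["nodes"]):
        relationship = node.get("relationship", {})
        kind = relationship.get("type", "root")
        print("  " * (depth + 1) + f"{node['table']} {node['columns']} ({kind})")
```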
Environment Variables
Configure PGSync via environment variables:
# Schema
SCHEMA='/path/to/schema.json'
# PostgreSQL
PG_HOST=localhost
PG_PORT=5432
PG_USER=postgres
PG_PASSWORD=*****
# Elasticsearch / OpenSearch
ELASTICSEARCH_HOST=localhost
ELASTICSEARCH_PORT=9200
# Redis (optional in WAL mode)
REDIS_HOST=localhost
REDIS_PORT=6379
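When scripting around PGSync, it can help to read the same variables in one place. The sketch below gathers them with `os.environ`; it is not how PGSync loads its settings internally, and the fallback values and the `http://` scheme are assumptions chosen for this example.

```python
import os


def load_settings(env=None):
    """Collect the variables listed above into one dictionary.

    The fallback values are assumptions for this sketch, not documented
    PGSync defaults.
    """
    env = os.environ if env is None else env
    es_host = env.get("ELASTICSEARCH_HOST", "localhost")
    es_port = int(env.get("ELASTICSEARCH_PORT", "9200"))
    return {
        "schema": env.get("SCHEMA", "schema.json"),
        "pg_host": env.get("PG_HOST", "localhost"),
        "pg_port": int(env.get("PG_PORT", "5432")),
        "pg_user": env.get("PG_USER", "postgres"),
        "pg_password": env.get("PG_PASSWORD", ""),
        # Assumption: plain HTTP; adjust the scheme for a TLS-enabled cluster.
        "search_url": f"http://{es_host}:{es_port}",
        "redis_host": env.get("REDIS_HOST", "localhost"),
        "redis_port": int(env.get("REDIS_PORT", "6379")),
    }


settings = load_settings()
# Avoid printing the password when inspecting the result.
print({key: value for key, value in settings.items() if key != "pg_password"})
```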
Running
Bootstrap (run once to set up triggers and replication slots):
bootstrap --config schema.json
Run as daemon:
pgsync --config schema.json --daemon
Links
Documentation: https://pgsync.com
Source Code: https://github.com/toluaina/pgsync
Bug Reports: https://github.com/toluaina/pgsync/issues
Sponsor: https://github.com/sponsors/toluaina
License
MIT License - see LICENSE for details.
Download files
File details
Details for the file pgsync-7.0.5.tar.gz.
File metadata
- Download URL: pgsync-7.0.5.tar.gz
- Upload date:
- Size: 138.5 kB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | f4e6637a3383f800f9bac75b1276a972c384935eb0589b0238db21803570e380 |
| MD5 | 648b0985391f90015f74f801d394a3bf |
| BLAKE2b-256 | fbb08f1c4440e9251b0bb76fe315fa4938ee77cedbc5c791ff694eb683e6fe31 |
Provenance
The following attestation bundles were made for pgsync-7.0.5.tar.gz:
Publisher: python-publish.yml on toluaina/pgsync
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: pgsync-7.0.5.tar.gz
- Subject digest: f4e6637a3383f800f9bac75b1276a972c384935eb0589b0238db21803570e380
- Sigstore transparency entry: 790576775
- Permalink: toluaina/pgsync@1a3f5717c645ff1eca2428c14591fdd4cdddd9fc
- Branch / Tag: refs/tags/7.0.5
- Owner: https://github.com/toluaina
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: python-publish.yml@1a3f5717c645ff1eca2428c14591fdd4cdddd9fc
- Trigger Event: release
File details
Details for the file pgsync-7.0.5-py3-none-any.whl.
File metadata
- Download URL: pgsync-7.0.5-py3-none-any.whl
- Upload date:
- Size: 79.1 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | e255f5bd67f3212e48666ccef9bc53240d144d519a28ee3f2a69dcaa8a1a3e84 |
| MD5 | a924cf632ce6b89a9a47a4427ba2d039 |
| BLAKE2b-256 | ab4274d084113238cb327429dfe86dfed8d4fbb3d268ae8c2e81c7d7902b03a8 |
Provenance
The following attestation bundles were made for pgsync-7.0.5-py3-none-any.whl:
Publisher: python-publish.yml on toluaina/pgsync
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: pgsync-7.0.5-py3-none-any.whl
- Subject digest: e255f5bd67f3212e48666ccef9bc53240d144d519a28ee3f2a69dcaa8a1a3e84
- Sigstore transparency entry: 790576777
- Permalink: toluaina/pgsync@1a3f5717c645ff1eca2428c14591fdd4cdddd9fc
- Branch / Tag: refs/tags/7.0.5
- Owner: https://github.com/toluaina
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: python-publish.yml@1a3f5717c645ff1eca2428c14591fdd4cdddd9fc
- Trigger Event: release