Postgres/MySQL/MariaDB to Elasticsearch/OpenSearch sync

PGSync is middleware that syncs data from PostgreSQL, MySQL, or MariaDB to Elasticsearch or OpenSearch.

Keep your relational database as the source of truth and expose structured, denormalized documents in your search engine.

Key Features

  • Real-time sync via logical decoding (PostgreSQL) or binary log (MySQL/MariaDB)

  • Denormalize complex relational data into nested search documents

  • JSON schema-based configuration

  • Support for one-to-one and one-to-many relationships

  • Plugin system for document transformation

  • Multiple operation modes: daemon, polling, or direct WAL streaming

Requirements

Installation

Install from PyPI:

pip install pgsync

Database Setup

PostgreSQL

Enable logical decoding in your PostgreSQL configuration (postgresql.conf):

wal_level = logical
max_replication_slots = 1
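
Before restarting PostgreSQL, it can help to confirm that the file really contains both settings. The following is a minimal sketch, not part of PGSync: parse_conf and logical_decoding_problems are hypothetical helpers, and the parser only understands simple `key = value` lines, which is a simplification of the real postgresql.conf syntax.

```python
def parse_conf(text: str) -> dict[str, str]:
    """Parse simple 'key = value' lines, ignoring comments and blank lines."""
    settings = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop trailing comments
        if not line or "=" not in line:
            continue
        key, value = line.split("=", 1)
        settings[key.strip().lower()] = value.strip().strip("'\"")
    return settings


def logical_decoding_problems(settings: dict[str, str]) -> list[str]:
    """Return human-readable problems; an empty list means both settings look fine.

    A missing key is reported too, because this sketch only sees the file,
    not the server's built-in defaults.
    """
    problems = []
    if settings.get("wal_level") != "logical":
        problems.append("wal_level must be 'logical'")
    try:
        slots = int(settings.get("max_replication_slots", "0"))
    except ValueError:
        slots = 0
    if slots < 1:
        problems.append("max_replication_slots must be at least 1")
    return problems


# The two lines from the section above, used as sample input.
sample = """
wal_level = logical
max_replication_slots = 1
"""
print(logical_decoding_problems(parse_conf(sample)))
```

On a running server, `SHOW wal_level;` in psql reports the value that is actually in effect.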

MySQL / MariaDB

Enable binary logging in your MySQL/MariaDB configuration (my.cnf):

server-id = 1
log_bin = mysql-bin
binlog_row_image = FULL
binlog_expire_logs_seconds = 604800

Create a replication user:

CREATE USER 'replicator'@'%' IDENTIFIED WITH mysql_native_password BY 'password';
GRANT REPLICATION SLAVE, REPLICATION CLIENT ON *.* TO 'replicator'@'%';
FLUSH PRIVILEGES;

Configuration

Create a JSON schema file (e.g., schema.json) defining your sync mapping:

[
    {
        "database": "book",
        "index": "book",
        "nodes": {
            "table": "book",
            "columns": ["isbn", "title", "description"],
            "children": [
                {
                    "table": "publisher",
                    "columns": ["name"],
                    "relationship": {
                        "variant": "object",
                        "type": "one_to_one"
                    }
                },
                {
                    "table": "author",
                    "label": "authors",
                    "columns": ["name", "date_of_birth"],
                    "relationship": {
                        "variant": "object",
                        "type": "one_to_many",
                        "through_tables": ["book_author"]
                    }
                }
            ]
        }
    }
]

See the examples directory for more schema examples (airbnb, social, rental, etc.).
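
To make the mapping concrete, here is a rough sketch of the kind of nested document the schema above describes. It is an illustration in plain Python, not PGSync's implementation: the sample rows, the publisher_id foreign key, and the build_document helper are all hypothetical, and the exact document shape PGSync emits may differ.

```python
import json

# Hypothetical rows standing in for the book, publisher, and author tables.
book = {
    "isbn": "001",
    "title": "Example Title",
    "description": "An example book.",
    "publisher_id": 1,  # assumed foreign key, not named in the schema
}
publishers = {1: {"name": "Example Press"}}
authors = {
    10: {"name": "A. Writer", "date_of_birth": "1970-01-01"},
    11: {"name": "B. Author", "date_of_birth": "1980-02-02"},
}
# Hypothetical rows of the book_author through table: (isbn, author id).
book_author = [("001", 10), ("001", 11)]


def build_document(row: dict) -> dict:
    """Denormalize one book row into a nested document."""
    # Root node: only the columns listed in the schema.
    doc = {col: row[col] for col in ("isbn", "title", "description")}
    # one_to_one child without a label: keyed by the table name.
    doc["publisher"] = dict(publishers[row["publisher_id"]])
    # one_to_many child reached through book_author, keyed by its label.
    doc["authors"] = [
        dict(authors[author_id])
        for isbn, author_id in book_author
        if isbn == row["isbn"]
    ]
    return doc


print(json.dumps(build_document(book), indent=2))
```

The point of the sketch is the nesting: the one_to_one child becomes a single object, and the one_to_many child becomes a list under its label.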

Environment Variables

Configure PGSync via environment variables:

# Schema
SCHEMA='/path/to/schema.json'

# PostgreSQL
PG_HOST=localhost
PG_PORT=5432
PG_USER=postgres
PG_PASSWORD=*****

# Elasticsearch / OpenSearch
ELASTICSEARCH_HOST=localhost
ELASTICSEARCH_PORT=9200

# Redis (optional in WAL mode)
REDIS_HOST=localhost
REDIS_PORT=6379
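
If you want to inspect these values from your own tooling, a minimal sketch like the following reads the same variable names with os.environ. The load_settings and masked helpers are hypothetical, and the fallback values simply mirror the example above; they are not necessarily PGSync's own defaults.

```python
import os
from typing import Mapping, Optional


def load_settings(env: Optional[Mapping[str, str]] = None) -> dict:
    """Collect the variables from the example above into one dict."""
    env = os.environ if env is None else env
    return {
        "schema": env.get("SCHEMA"),
        "pg_host": env.get("PG_HOST", "localhost"),
        "pg_port": int(env.get("PG_PORT", "5432")),
        "pg_user": env.get("PG_USER", "postgres"),
        "pg_password": env.get("PG_PASSWORD"),
        "elasticsearch_host": env.get("ELASTICSEARCH_HOST", "localhost"),
        "elasticsearch_port": int(env.get("ELASTICSEARCH_PORT", "9200")),
        "redis_host": env.get("REDIS_HOST", "localhost"),
        "redis_port": int(env.get("REDIS_PORT", "6379")),
    }


def masked(settings: dict) -> dict:
    """Return a copy that is safe to print, with the password hidden."""
    out = dict(settings)
    if out.get("pg_password"):
        out["pg_password"] = "*****"
    return out


print(masked(load_settings()))
```

Passing a plain dict instead of the real environment keeps the helper easy to test.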

Running

Bootstrap (run once to set up triggers and replication slots):

bootstrap --config schema.json

Run as daemon:

pgsync --config schema.json --daemon
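
When the two steps are launched from a deployment script, they can be wrapped with Python's subprocess module. This is only a sketch under the assumption that the bootstrap and pgsync executables are on PATH; the helper names are hypothetical, and the only flags used are the ones shown above.

```python
import shlex
import subprocess
from typing import Callable, List


def bootstrap_command(config: str) -> List[str]:
    """Argument list for the one-time bootstrap step."""
    return ["bootstrap", "--config", config]


def daemon_command(config: str) -> List[str]:
    """Argument list for running PGSync as a daemon."""
    return ["pgsync", "--config", config, "--daemon"]


def run_sync(config: str, runner: Callable = subprocess.run) -> None:
    """Run bootstrap first, then start the daemon.

    check=True makes a non-zero exit status raise CalledProcessError,
    so the daemon is not started if bootstrap fails.
    """
    runner(bootstrap_command(config), check=True)
    runner(daemon_command(config), check=True)


# Show the commands that run_sync("schema.json") would execute.
print(shlex.join(bootstrap_command("schema.json")))
print(shlex.join(daemon_command("schema.json")))
```

Calling run_sync("schema.json") executes both commands in order. Because the daemon runs until it is stopped, the second call blocks for as long as the sync is running.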

License

MIT License - see LICENSE for details.
