Skip to main content

Postgres to elasticsearch sync

Project description

# PostgreSQL to Elasticsearch sync

PGSync is a middleware for syncing data from Postgres to Elasticsearch. It allows you to keep Postgres as your source of truth data source and expose structured denormalized documents in Elasticsearch.

### Requirements

### Postgres setup

Enable logical decoding in your Postgres setting.

  • you would also need to set up at least two parameters at postgresql.conf

    `wal_level = logical`

    `max_replication_slots = 1`

### Installation

You can install PGSync from PyPI:

$ pip install pgsync

### Config

Create a schema for the application named e.g schema.json

Example schema

Example spec

[
    {
        "index": "[database name]",
        "nodes": [
            {
                "table": "[table A]",
                "columns": [
                    "column 1 from table A",
                    "column 2 from table A",
                    ... additional columns
                ],
                "children": [
                    {
                        "table": "[table B with relationship to table A]",
                        "columns": [
                          "column 1 from table B",
                          "column 2 from table B",
                          ... additional columns
                        ],
                        "relationship": {
                            "variant": "object",
                            "type": "one_to_many"
                        },
                        ...
                    },
                    {
                        ... any other additional children
                    }
                ]
            }
        ]
    }
]

### Environment variables

Setup required environment variables for the application

SCHEMA=’/path/to/schema.json’

ELASTICSEARCH_HOST=localhost ELASTICSEARCH_PORT=9200

PG_HOST=localhost PG_USER=i-am-root # this must be a postgres superuser PG_PORT=5432 PG_PASSWORD=*****

REDIS_HOST=redis REDIS_PORT=6379 REDIS_DB=0 REDIS_AUTH=*****

## Running

bootstrap the database (one time only)

$ bootstrap –config schema.json

run pgsync as a daemon

$ pgsync –config schema.json –daemon

History

1.0.1 (2020-15-01)

  • First release on PyPI.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distributions

No source distribution files available for this release.See tutorial on generating distribution archives.

Built Distributions

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page