Skip to main content

Postgres to elasticsearch sync

Project description

PostgreSQL to Elasticsearch sync

PGSync <https://pgsync.com>_ is a middleware for syncing data from Postgres <https://www.postgresql.org>_ to Elasticsearch <https://www.elastic.co/products/elastic-stack>.
It allows you to keep Postgres <https://www.postgresql.org>
as your source of truth data source and expose structured denormalized documents in Elasticsearch <https://www.elastic.co/products/elastic-stack>_.

Requirements

  • Python <https://www.python.org>_ 3.6+
  • Postgres <https://www.postgresql.org>_ 9.4+
  • Redis <https://redis.io>_
  • Elasticsearch <https://www.elastic.co/products/elastic-stack>_ 6.3.1+

Postgres setup

Enable logical decoding <https://www.postgresql.org/docs/current/logicaldecoding.html>_ in your Postgres setting.

  • You would also need to set up two parameters in your Postgres config postgresql.conf

    wal_level = logical

    max_replication_slots = 1

Installation

You can install PGSync from PyPI <https://pypi.org>_:

$ pip install pgsync

Config

Create a schema for the application named e.g schema.json

Example schema <https://github.com/toluaina/pgsync/blob/master/examples/airbnb/schema.json>_

Example spec

.. code-block::

[
    {
        "database": "[database name]",
        "index": "[elasticsearch index]",
        "nodes": [
            {
                "table": "[table A]",
                "schema": "[table A schema]",
                "columns": [
                    "column 1 from table A",
                    "column 2 from table A",
                    ... additional columns
                ],
                "children": [
                    {
                        "table": "[table B with relationship to table A]",
                        "schema": "[table B schema]",
                        "columns": [
                          "column 1 from table B",
                          "column 2 from table B",
                          ... additional columns
                        ],
                        "relationship": {
                            "variant": "object",
                            "type": "one_to_many"
                        },
                        ...
                    },
                    {
                        ... any other additional children
                    }
                ]
            }
        ]
    }
]

Environment variables

Setup required environment variables for the application

SCHEMA='/path/to/schema.json'

ELASTICSEARCH_HOST=localhost
ELASTICSEARCH_PORT=9200

PG_HOST=localhost
PG_USER=i-am-root # this must be a postgres superuser
PG_PORT=5432
PG_PASSWORD=*****

REDIS_HOST=redis
REDIS_PORT=6379
REDIS_DB=0
REDIS_AUTH=*****

Running

Bootstrap the database (one time only) $ bootstrap --config schema.json Run pgsync as a daemon $ pgsync --config schema.json --daemon

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

pgsync-1.1.31.tar.gz (78.8 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

pgsync-1.1.31-py2.py3-none-any.whl (40.3 kB view details)

Uploaded Python 2Python 3

File details

Details for the file pgsync-1.1.31.tar.gz.

File metadata

  • Download URL: pgsync-1.1.31.tar.gz
  • Upload date:
  • Size: 78.8 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.4.1 importlib_metadata/3.10.0 pkginfo/1.7.0 requests/2.25.1 requests-toolbelt/0.9.1 tqdm/4.60.0 CPython/3.9.4

File hashes

Hashes for pgsync-1.1.31.tar.gz
Algorithm Hash digest
SHA256 8e827f55dfb94bec9ab7be04716b0f954068d57b45fa7dc8c8cdfc411b9d2dd8
MD5 e88500b9e8f7050d3e14db36a9dd28e8
BLAKE2b-256 6ec3507b9bf9585a1eef36fdab2a670bddf3579f67638381e0de5c62c4f53158

See more details on using hashes here.

File details

Details for the file pgsync-1.1.31-py2.py3-none-any.whl.

File metadata

  • Download URL: pgsync-1.1.31-py2.py3-none-any.whl
  • Upload date:
  • Size: 40.3 kB
  • Tags: Python 2, Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.4.1 importlib_metadata/3.10.0 pkginfo/1.7.0 requests/2.25.1 requests-toolbelt/0.9.1 tqdm/4.60.0 CPython/3.9.4

File hashes

Hashes for pgsync-1.1.31-py2.py3-none-any.whl
Algorithm Hash digest
SHA256 d1dfebd0df3ce1404d6669aa033bddd5b0f28fb4279ef93568138a147ae9d8de
MD5 0013044f1326db5895e1203e8e307ca9
BLAKE2b-256 928a556e88228ea5f2151657105cd82ea89ffd3df6a2ed7a3e0a5a55355f84d8

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page