Skip to main content

Postgres to elasticsearch sync

Project description

PostgreSQL to Elasticsearch sync

PGSync <https://pgsync.com>_ is a middleware for syncing data from Postgres <https://www.postgresql.org>_ to Elasticsearch <https://www.elastic.co/products/elastic-stack>.
It allows you to keep Postgres <https://www.postgresql.org>
as your source of truth data source and expose structured denormalized documents in Elasticsearch <https://www.elastic.co/products/elastic-stack>_.

Requirements

  • Python <https://www.python.org>_ 3.6+
  • Postgres <https://www.postgresql.org>_ 9.4+
  • Redis <https://redis.io>_
  • Elasticsearch <https://www.elastic.co/products/elastic-stack>_ 6.3.1+

Postgres setup

Enable logical decoding <https://www.postgresql.org/docs/current/logicaldecoding.html>_ in your Postgres setting.

  • You would also need to set up two parameters in your Postgres config postgresql.conf

    wal_level = logical

    max_replication_slots = 1

Installation

You can install PGSync from PyPI <https://pypi.org>_:

$ pip install pgsync

Config

Create a schema for the application named e.g schema.json

Example schema <https://github.com/toluaina/pgsync/blob/master/examples/airbnb/schema.json>_

Example spec

.. code-block::

[
    {
        "database": "[database name]",
        "index": "[elasticsearch index]",
        "nodes": [
            {
                "table": "[table A]",
                "schema": "[table A schema]",
                "columns": [
                    "column 1 from table A",
                    "column 2 from table A",
                    ... additional columns
                ],
                "children": [
                    {
                        "table": "[table B with relationship to table A]",
                        "schema": "[table B schema]",
                        "columns": [
                          "column 1 from table B",
                          "column 2 from table B",
                          ... additional columns
                        ],
                        "relationship": {
                            "variant": "object",
                            "type": "one_to_many"
                        },
                        ...
                    },
                    {
                        ... any other additional children
                    }
                ]
            }
        ]
    }
]

Environment variables

Setup required environment variables for the application

SCHEMA='/path/to/schema.json'

ELASTICSEARCH_HOST=localhost
ELASTICSEARCH_PORT=9200

PG_HOST=localhost
PG_USER=i-am-root # this must be a postgres superuser
PG_PORT=5432
PG_PASSWORD=*****

REDIS_HOST=redis
REDIS_PORT=6379
REDIS_DB=0
REDIS_AUTH=*****

Running

Bootstrap the database (one time only) $ bootstrap --config schema.json Run pgsync as a daemon $ pgsync --config schema.json --daemon

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

pgsync-1.1.27.tar.gz (74.9 kB view details)

Uploaded Source

Built Distribution

pgsync-1.1.27-py2.py3-none-any.whl (37.5 kB view details)

Uploaded Python 2 Python 3

File details

Details for the file pgsync-1.1.27.tar.gz.

File metadata

  • Download URL: pgsync-1.1.27.tar.gz
  • Upload date:
  • Size: 74.9 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.3.0 pkginfo/1.7.0 requests/2.25.1 setuptools/49.2.1 requests-toolbelt/0.9.1 tqdm/4.57.0 CPython/3.9.1

File hashes

Hashes for pgsync-1.1.27.tar.gz
Algorithm Hash digest
SHA256 27958d0a05c9dc3838199e1a04ff3f1a05ea1d61cb6b11d2f636260d00473f5a
MD5 9e7b0e3e9f46ce84c183e3d55ac20d9d
BLAKE2b-256 3f3c4a3d83504b7ad81be3dbd856b5cf463b29990ee21d1f810826a06524c747

See more details on using hashes here.

File details

Details for the file pgsync-1.1.27-py2.py3-none-any.whl.

File metadata

  • Download URL: pgsync-1.1.27-py2.py3-none-any.whl
  • Upload date:
  • Size: 37.5 kB
  • Tags: Python 2, Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.3.0 pkginfo/1.7.0 requests/2.25.1 setuptools/49.2.1 requests-toolbelt/0.9.1 tqdm/4.57.0 CPython/3.9.1

File hashes

Hashes for pgsync-1.1.27-py2.py3-none-any.whl
Algorithm Hash digest
SHA256 00d89604eb4f7b49089cc048e6cb6afacf3e2cf276460d6d1af6be73e29faa27
MD5 4ccb16eea861782de580dc88be5c84c0
BLAKE2b-256 15dd8269e71deee8988b24446e59a44fec01f5ddedd1cd163f5298088c9abe7c

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page