PostgreSQL to Elasticsearch sync

`PGSync <https://pgsync.com>`_ is middleware for syncing data from `Postgres <https://www.postgresql.org>`_ to `Elasticsearch <https://www.elastic.co/products/elastic-stack>`_.
It allows you to keep `Postgres <https://www.postgresql.org>`_
as your source of truth and expose structured denormalized documents in `Elasticsearch <https://www.elastic.co/products/elastic-stack>`_.

Requirements

  • `Python <https://www.python.org>`_ 3.6+
  • `Postgres <https://www.postgresql.org>`_ 9.4+
  • `Redis <https://redis.io>`_
  • `Elasticsearch <https://www.elastic.co/products/elastic-stack>`_ 6.3.1+

Postgres setup

Enable `logical decoding <https://www.postgresql.org/docs/current/logicaldecoding.html>`_ in your Postgres settings.

  • Set the following two parameters in your Postgres config file, postgresql.conf:

    wal_level = logical

    max_replication_slots = 1
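
As a quick sanity check before bootstrapping, the two settings above can be verified with a few lines of Python. This is only a sketch and not part of PGSync: the helper names ``parse_conf`` and ``check_logical_decoding`` are hypothetical, and the parser handles only plain ``name = value`` lines (it does not follow ``include`` directives or handle ``#`` inside quoted values).

```python
def parse_conf(text):
    """Parse simple `name = value` lines from postgresql.conf-style text."""
    settings = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop trailing comments
        if "=" not in line:
            continue
        name, _, value = line.partition("=")
        settings[name.strip().lower()] = value.strip().strip("'")
    return settings


def check_logical_decoding(text):
    """Return a list of problems; an empty list means both settings look fine."""
    settings = parse_conf(text)
    problems = []
    if settings.get("wal_level") != "logical":
        problems.append("wal_level must be 'logical'")
    try:
        slots = int(settings.get("max_replication_slots", "0"))
    except ValueError:
        slots = 0
    if slots < 1:
        problems.append("max_replication_slots must be at least 1")
    return problems


if __name__ == "__main__":
    sample = "wal_level = logical\nmax_replication_slots = 1\n"
    print(check_logical_decoding(sample) or "logical decoding settings look OK")
```

Pass the contents of your own postgresql.conf to ``check_logical_decoding`` to run the same check against a real configuration.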

Installation

You can install PGSync from `PyPI <https://pypi.org>`_:

$ pip install pgsync

Config

Create a schema file for the application, named e.g. schema.json.

`Example schema <https://github.com/toluaina/pg-sync/blob/master/examples/airbnb/schema.json>`_

Example spec

.. code-block::

    [
        {
            "database": "[database name]",
            "index": "[elasticsearch index]",
            "nodes": [
                {
                    "table": "[table A]",
                    "schema": "[table A schema]",
                    "columns": [
                        "column 1 from table A",
                        "column 2 from table A",
                        ... additional columns
                    ],
                    "children": [
                        {
                            "table": "[table B with relationship to table A]",
                            "schema": "[table B schema]",
                            "columns": [
                              "column 1 from table B",
                              "column 2 from table B",
                              ... additional columns
                            ],
                            "relationship": {
                                "variant": "object",
                                "type": "one_to_many"
                            },
                            ...
                        },
                        {
                            ... any other additional children
                        }
                    ]
                }
            ]
        }
    ]
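
To make the template above concrete, here is a small sketch that builds a schema with the same keys and writes it out as JSON. The database, index, table, and column names (``bookstore``, ``books``, ``book``, ``review``, and so on) are made up for illustration, and ``walk_tables`` is a hypothetical helper, not a PGSync function.

```python
import json


def build_schema():
    """Build a schema document using the keys from the example spec."""
    review = {
        "table": "review",        # hypothetical child table
        "schema": "public",
        "columns": ["id", "rating"],
        "relationship": {"variant": "object", "type": "one_to_many"},
    }
    book = {
        "table": "book",          # hypothetical parent table
        "schema": "public",
        "columns": ["id", "title"],
        "children": [review],
    }
    return [{"database": "bookstore", "index": "books", "nodes": [book]}]


def walk_tables(node):
    """Yield the table name of a node and of all its descendants."""
    yield node["table"]
    for child in node.get("children", []):
        yield from walk_tables(child)


if __name__ == "__main__":
    schema = build_schema()
    with open("schema.json", "w") as fh:
        json.dump(schema, fh, indent=4)
    print(list(walk_tables(schema[0]["nodes"][0])))
```

Building the document in code and serialising it with ``json.dump`` avoids hand-editing mistakes such as trailing commas in nested ``children`` lists.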

Environment variables

Set up the required environment variables for the application:

SCHEMA='/path/to/schema.json'

ELASTICSEARCH_HOST=localhost
ELASTICSEARCH_PORT=9200

PG_HOST=localhost
PG_USER=i-am-root # this must be a postgres superuser
PG_PORT=5432
PG_PASSWORD=*****

REDIS_HOST=redis
REDIS_PORT=6379
REDIS_DB=0
REDIS_AUTH=*****
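
The sketch below shows one way to read and validate these variables from Python before starting a sync. It is not PGSync's own configuration code: ``load_settings`` is a hypothetical helper, the defaults simply mirror the example values above, and treating ``SCHEMA``, ``PG_USER``, and ``PG_PASSWORD`` as mandatory is an assumption.

```python
import os

# Defaults mirror the example values shown above (an assumption).
DEFAULTS = {
    "ELASTICSEARCH_HOST": "localhost",
    "ELASTICSEARCH_PORT": "9200",
    "PG_HOST": "localhost",
    "PG_PORT": "5432",
    "REDIS_HOST": "redis",
    "REDIS_PORT": "6379",
    "REDIS_DB": "0",
}
REQUIRED = ("SCHEMA", "PG_USER", "PG_PASSWORD")
INT_KEYS = ("ELASTICSEARCH_PORT", "PG_PORT", "REDIS_PORT", "REDIS_DB")


def load_settings(env=None):
    """Collect the settings from a mapping (os.environ by default)."""
    env = os.environ if env is None else env
    missing = [key for key in REQUIRED if not env.get(key)]
    if missing:
        raise KeyError("missing environment variables: " + ", ".join(missing))
    settings = {key: env.get(key, default) for key, default in DEFAULTS.items()}
    settings.update({key: env[key] for key in REQUIRED})
    settings["REDIS_AUTH"] = env.get("REDIS_AUTH")  # optional
    for key in INT_KEYS:
        settings[key] = int(settings[key])  # ports and DB index as integers
    return settings
```

Calling ``load_settings()`` with no argument reads from the process environment; passing a plain dict makes the function easy to test.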

Running

Bootstrap the database (one time only):

$ bootstrap --config schema.json

Run pgsync as a daemon:

$ pgsync --config schema.json --daemon
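
If you prefer to drive the two commands above from a script, a thin wrapper around ``subprocess`` is enough. This is a sketch under the assumption that the ``bootstrap`` and ``pgsync`` executables are on your ``PATH``; the function names are hypothetical, and only the ``--config`` and ``--daemon`` flags shown above are used.

```python
import subprocess


def bootstrap_command(config):
    """Argument list for the one-time bootstrap step."""
    return ["bootstrap", "--config", config]


def daemon_command(config):
    """Argument list for running pgsync as a daemon."""
    return ["pgsync", "--config", config, "--daemon"]


def bootstrap_and_run(config):
    """Bootstrap once, then start the daemon; raise if either command fails."""
    subprocess.run(bootstrap_command(config), check=True)
    subprocess.run(daemon_command(config), check=True)
```

Keeping the argument lists in separate functions lets you inspect or log the exact command before it is executed.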

History

1.0.1 (2020-15-01)

  • First release on PyPI.

1.0.1 (2020-01-01)

  • RC1 release

1.1.0 (2020-04-13)

  • Postgres multi-schema support for multi-tenant applications

  • Show resulting Query with verbose mode

  • This release requires you to re-bootstrap your database with:

    • bootstrap -t
    • bootstrap

1.1.1 (2020-05-18)

  • Fixed authentication with Redis
  • Fixed Docker build

1.1.2 (2020-06-11)

  • Sync multiple indices in the same schema
  • Test for replication or superuser
  • Fix PG_NOTIFY error when payload exceeds 8000 bytes limit

1.1.3 (2020-06-14)

  • Bug fix when syncing multiple indices in the same schema

1.1.4 (2020-06-15)

  • Only create triggers for tables present in schema

1.1.5 (2020-06-16)

  • Bug fix when creating multiple triggers in same schema

1.1.6 (2020-07-31)

  • Bug fix when tearing down secondary schema

1.1.7 (2020-08-16)

  • Fix issue #29: SQLAlchemy err: Neither 'BooleanClauseList' object nor 'Comparator' object has an attribute '_orig'

1.1.8 (2020-08-19)

  • Fix issue #30: Traceback AttributeError: id

1.1.9 (2020-08-26)

  • Fix issue #33: Unable to set Redis port via environment variable.

1.1.10 (2020-08-29)

  • Support Amazon RDS #16
  • Optimize database reflection on startup
  • Show elapsed time

1.1.11 (2020-09-07)

  • Support specifying the Elasticsearch field data type

1.1.12 (2020-09-08)

  • Add support for SSL TCP/IP connection mode
