PostgreSQL to Elasticsearch sync

`PGSync <https://pgsync.com>`_ is middleware for syncing data from `Postgres <https://www.postgresql.org>`_ to `Elasticsearch <https://www.elastic.co/products/elastic-stack>`_.
It allows you to keep `Postgres <https://www.postgresql.org>`_ as your source of truth and expose structured, denormalized documents in `Elasticsearch <https://www.elastic.co/products/elastic-stack>`_.

Requirements

  • `Python <https://www.python.org>`_ 3.6+
  • `Postgres <https://www.postgresql.org>`_ 9.4+
  • `Redis <https://redis.io>`_
  • `Elasticsearch <https://www.elastic.co/products/elastic-stack>`_ 6.3.1+

Postgres setup

Enable `logical decoding <https://www.postgresql.org/docs/current/logicaldecoding.html>`_ in your Postgres settings.

  • Set the following two parameters in your Postgres config file, postgresql.conf:

    wal_level = logical

    max_replication_slots = 1
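
As a quick sanity check before bootstrapping, the two settings above can be verified against the text of a postgresql.conf file. The sketch below is illustrative only: ``parse_conf`` and ``check_logical_decoding`` are hypothetical helpers, not part of PGSync, and the parser is simplified (it treats every ``#`` as the start of a comment).

```python
import re

# Values taken from the section above. Treating 1 as a minimum slot count
# is an assumption of this sketch, not a documented PGSync requirement.
REQUIRED_WAL_LEVEL = "logical"
MIN_REPLICATION_SLOTS = 1


def parse_conf(text):
    """Parse 'name = value' lines from postgresql.conf-style text."""
    settings = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments (simplified)
        if not line:
            continue
        match = re.match(r"^([A-Za-z_][\w.]*)\s*=?\s*(.*)$", line)
        if match:
            name, value = match.groups()
            settings[name.lower()] = value.strip().strip("'")
    return settings


def check_logical_decoding(text):
    """Return a list of problems; an empty list means both settings look fine."""
    settings = parse_conf(text)
    problems = []
    if settings.get("wal_level") != REQUIRED_WAL_LEVEL:
        problems.append("wal_level must be 'logical'")
    try:
        slots = int(settings.get("max_replication_slots", "0"))
    except ValueError:
        slots = 0
    if slots < MIN_REPLICATION_SLOTS:
        problems.append("max_replication_slots must be at least 1")
    return problems


if __name__ == "__main__":
    sample = "wal_level = logical\nmax_replication_slots = 1\n"
    print(check_logical_decoding(sample))
```

Note that the file on disk only reflects the configured values; Postgres must be restarted for a change to wal_level to take effect.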

Installation

You can install PGSync from `PyPI <https://pypi.org>`_:

$ pip install pgsync

Config

Create a schema file for the application, named e.g. schema.json

`Example schema <https://github.com/toluaina/pg-sync/blob/master/examples/airbnb/schema.json>`_

Example spec

.. code-block::

    [
        {
            "database": "[database name]",
            "index": "[elasticsearch index]",
            "nodes": [
                {
                    "table": "[table A]",
                    "schema": "[table A schema]",
                    "columns": [
                        "column 1 from table A",
                        "column 2 from table A",
                        ... additional columns
                    ],
                    "children": [
                        {
                            "table": "[table B with relationship to table A]",
                            "schema": "[table B schema]",
                            "columns": [
                                "column 1 from table B",
                                "column 2 from table B",
                                ... additional columns
                            ],
                            "relationship": {
                                "variant": "object",
                                "type": "one_to_many"
                            },
                            ...
                        },
                        {
                            ... any other additional children
                        }
                    ]
                }
            ]
        }
    ]
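
To make the template above concrete, here is a minimal sketch that fills it in and serializes it to JSON. The database, index, table, and column names (``mydb``, ``books``, ``book``, ``author``) are hypothetical placeholders, and ``check_node`` is only a sanity check for the keys shown in the example spec; it is not PGSync's own validation.

```python
import json

# Hypothetical example: a "book" table with child "author" rows.
# Replace every name here with the ones from your own database.
schema = [
    {
        "database": "mydb",
        "index": "books",
        "nodes": [
            {
                "table": "book",
                "schema": "public",
                "columns": ["isbn", "title"],
                "children": [
                    {
                        "table": "author",
                        "schema": "public",
                        "columns": ["id", "name"],
                        "relationship": {
                            "variant": "object",
                            "type": "one_to_many",
                        },
                    }
                ],
            }
        ],
    }
]


def check_node(node):
    """Recursively check that a node has the keys used in the example spec."""
    for key in ("table", "schema", "columns"):
        if key not in node:
            raise ValueError(f"node is missing {key!r}")
    for child in node.get("children", []):
        if "relationship" not in child:
            raise ValueError(f"child {child.get('table')!r} needs a relationship")
        check_node(child)


def dump_schema(spec):
    """Check every node, then return the spec as formatted JSON text."""
    for entry in spec:
        for node in entry["nodes"]:
            check_node(node)
    return json.dumps(spec, indent=4)


if __name__ == "__main__":
    # Redirect this output to schema.json to use it as the config file.
    print(dump_schema(schema))
```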

Environment variables

Set up the required environment variables for the application:

SCHEMA='/path/to/schema.json'

ELASTICSEARCH_HOST=localhost
ELASTICSEARCH_PORT=9200

PG_HOST=localhost
PG_USER=i-am-root # this must be a postgres superuser
PG_PORT=5432
PG_PASSWORD=*****

REDIS_HOST=redis
REDIS_PORT=6379
REDIS_DB=0
REDIS_AUTH=*****
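
Before starting the sync, it can help to confirm that all of the variables listed above are set. The following is a minimal sketch under assumptions: ``load_settings`` is a hypothetical helper, not PGSync's own configuration loader, and it treats every variable in the list as required and the port and database values as integers.

```python
import os

# Variable names copied from the list above.
NAMES = (
    "SCHEMA",
    "ELASTICSEARCH_HOST", "ELASTICSEARCH_PORT",
    "PG_HOST", "PG_USER", "PG_PORT", "PG_PASSWORD",
    "REDIS_HOST", "REDIS_PORT", "REDIS_DB", "REDIS_AUTH",
)
# Assumption of this sketch: these values are meant to be integers.
INTEGER_NAMES = ("ELASTICSEARCH_PORT", "PG_PORT", "REDIS_PORT", "REDIS_DB")


def load_settings(env=None):
    """Return the listed variables as a dict, failing early if any is unset."""
    env = os.environ if env is None else env
    missing = [name for name in NAMES if not env.get(name)]
    if missing:
        raise KeyError("missing environment variables: " + ", ".join(missing))
    settings = {name: env[name] for name in NAMES}
    for name in INTEGER_NAMES:
        settings[name] = int(settings[name])
    return settings
```

Calling ``load_settings()`` with no argument reads from the current process environment; passing a dict makes the check easy to test.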

Running

Bootstrap the database (one time only):

$ bootstrap --config schema.json

Run pgsync as a daemon:

$ pgsync --config schema.json --daemon
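
If you prefer to drive the two steps from a script, they can be wrapped with the standard library. This is a sketch only: ``build_commands`` and ``run_all`` are hypothetical helper names, and the commands and flags are exactly the ones shown in this section.

```python
import subprocess


def build_commands(config_path):
    """Return the bootstrap and daemon commands as argument lists."""
    return [
        ["bootstrap", "--config", config_path],
        ["pgsync", "--config", config_path, "--daemon"],
    ]


def run_all(config_path):
    """Run bootstrap, then pgsync; stop at the first failing command."""
    for argv in build_commands(config_path):
        # check=True raises CalledProcessError on a non-zero exit status.
        # Assumption: the daemon command keeps running, so this call may block.
        subprocess.run(argv, check=True)
```

Bootstrapping is a one-time step, so a real script would normally skip the first command on later runs.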

History

1.0.1 (2020-01-15)

  • First release on PyPI.

1.0.1 (2020-01-01)

  • RC1 release

1.1.0 (2020-04-13)

  • Postgres multi-schema support for multi-tenant applications

  • Show the resulting query in verbose mode

  • This release requires you to re-bootstrap your database with:

    • bootstrap -t
    • bootstrap

1.1.1 (2020-05-18)

  • Fixed authentication with Redis
  • Fixed Docker build

1.1.2 (2020-06-11)

  • Sync multiple indices in the same schema
  • Test for replication or superuser
  • Fix PG_NOTIFY error when payload exceeds 8000 bytes limit

1.1.3 (2020-06-14)

  • Bug fix when syncing multiple indices in the same schema

1.1.4 (2020-06-15)

  • Only create triggers for tables present in schema

1.1.5 (2020-06-16)

  • Bug fix when creating multiple triggers in same schema

1.1.6 (2020-07-31)

  • Bug fix when tearing down secondary schema

1.1.7 (2020-08-16)

  • Fix issue #29: SQLAlchemy err: Neither 'BooleanClauseList' object nor 'Comparator' object has an attribute '_orig'

1.1.8 (2020-08-19)

  • Fix issue #30: Traceback AttributeError: id

1.1.9 (2020-08-26)

  • Fix issue #33: Unable to set Redis port via environment variable.

1.1.10 (2020-08-29)

  • Support Amazon RDS #16
  • Optimize database reflection on startup
  • Show elapsed time



