Postgres to Elasticsearch sync

PostgreSQL to Elasticsearch sync

`PGSync <https://pgsync.com>`_ is middleware for syncing data from `Postgres <https://www.postgresql.org>`_ to `Elasticsearch <https://www.elastic.co/products/elastic-stack>`_.
It allows you to keep `Postgres <https://www.postgresql.org>`_ as your source of truth and expose structured, denormalized documents in `Elasticsearch <https://www.elastic.co/products/elastic-stack>`_.

Requirements

  • `Python <https://www.python.org>`_ 3.6+
  • `Postgres <https://www.postgresql.org>`_ 9.4+
  • `Redis <https://redis.io>`_
  • `Elasticsearch <https://www.elastic.co/products/elastic-stack>`_ 6.3.1+

Postgres setup

Enable `logical decoding <https://www.postgresql.org/docs/current/logicaldecoding.html>`_ in your Postgres settings.

  • To do so, set the following two parameters in your Postgres configuration file, postgresql.conf:

    wal_level = logical

    max_replication_slots = 1
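As a quick sanity check of the two settings above, the following is a minimal sketch that reads postgresql.conf-style text and reports anything that looks wrong. The function names ``parse_conf`` and ``check_logical_decoding`` are hypothetical helpers, not part of PGSync, and the parser is deliberately simple: it does not handle ``#`` inside quoted values or ``include`` directives.

```python
import re

# Values taken from the two settings shown above.
REQUIRED_WAL_LEVEL = "logical"
MIN_REPLICATION_SLOTS = 1


def parse_conf(text):
    """Parse `name = value` lines from postgresql.conf-style text into a dict."""
    settings = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop trailing comments
        if not line:
            continue
        match = re.match(r"^([A-Za-z_][\w.]*)\s*=?\s*(.*)$", line)
        if match:
            name, value = match.groups()
            settings[name.lower()] = value.strip().strip("'")
    return settings


def check_logical_decoding(text):
    """Return a list of problems; an empty list means both settings look right."""
    settings = parse_conf(text)
    problems = []
    if settings.get("wal_level") != REQUIRED_WAL_LEVEL:
        problems.append("wal_level must be 'logical'")
    try:
        # A missing setting is treated as 0 here; the server default may differ.
        slots = int(settings.get("max_replication_slots", "0"))
    except ValueError:
        slots = 0
    if slots < MIN_REPLICATION_SLOTS:
        problems.append("max_replication_slots must be at least 1")
    return problems
```

Pass the contents of your own configuration file to ``check_logical_decoding``. The check only inspects the file text, so a server that has not been restarted since the file changed may still be running with the old values.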

Installation

You can install PGSync from `PyPI <https://pypi.org>`_:

$ pip install pgsync

Config

Create a schema file for the application, e.g. schema.json.

`Example schema <https://github.com/toluaina/pg-sync/blob/master/examples/airbnb/schema.json>`_

Example spec

.. code-block::

    [
        {
            "database": "[database name]",
            "index": "[elasticsearch index]",
            "nodes": [
                {
                    "table": "[table A]",
                    "schema": "[table A schema]",
                    "columns": [
                        "column 1 from table A",
                        "column 2 from table A",
                        ... additional columns
                    ],
                    "children": [
                        {
                            "table": "[table B with relationship to table A]",
                            "schema": "[table B schema]",
                            "columns": [
                                "column 1 from table B",
                                "column 2 from table B",
                                ... additional columns
                            ],
                            "relationship": {
                                "variant": "object",
                                "type": "one_to_many"
                            },
                            ...
                        },
                        {
                            ... any other additional children
                        }
                    ]
                }
            ]
        }
    ]
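The spec above is a template, so it is not valid JSON as written. To make the shape concrete, here is a minimal sketch that fills it in with one parent table and one child table and renders it as JSON. The database, index, table, and column names (``bookstore``, ``books``, ``book``, ``review``, and so on) are invented for illustration, and ``render_schema`` is a hypothetical helper, not part of PGSync.

```python
import json

# Hypothetical example: a "book" table with a one-to-many child table "review",
# both in the "public" schema. Replace every name with ones from your database.
schema = [
    {
        "database": "bookstore",
        "index": "books",
        "nodes": [
            {
                "table": "book",
                "schema": "public",
                "columns": ["id", "title"],
                "children": [
                    {
                        "table": "review",
                        "schema": "public",
                        "columns": ["id", "rating"],
                        "relationship": {
                            "variant": "object",
                            "type": "one_to_many",
                        },
                    }
                ],
            }
        ],
    }
]


def render_schema(spec):
    """Serialize the spec to an indented JSON string suitable for schema.json."""
    return json.dumps(spec, indent=4)
```

Writing the result of ``render_schema(schema)`` to a file gives a schema.json with the same keys as the template. Only the keys shown in the template are used here.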

Environment variables

Set up the required environment variables for the application:

SCHEMA='/path/to/schema.json'

ELASTICSEARCH_HOST=localhost
ELASTICSEARCH_PORT=9200

PG_HOST=localhost
PG_USER=i-am-root # this must be a postgres superuser
PG_PORT=5432
PG_PASSWORD=*****

REDIS_HOST=redis
REDIS_PORT=6379
REDIS_DB=0
REDIS_AUTH=*****
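Before starting the application, it can help to confirm that the variables listed above are present. The following is a minimal preflight sketch. ``check_environment`` is a hypothetical helper, not part of PGSync, and it assumes that every listed variable is required, which this section does not confirm.

```python
import os

# Names copied from the list above. Treating all of them as required is an
# assumption of this sketch; adjust the tuple if some are optional in your setup.
REQUIRED_VARS = (
    "SCHEMA",
    "ELASTICSEARCH_HOST", "ELASTICSEARCH_PORT",
    "PG_HOST", "PG_USER", "PG_PORT", "PG_PASSWORD",
    "REDIS_HOST", "REDIS_PORT", "REDIS_DB", "REDIS_AUTH",
)
PORT_VARS = ("ELASTICSEARCH_PORT", "PG_PORT", "REDIS_PORT")


def check_environment(env=None):
    """Return a list of problems found in `env` (defaults to os.environ)."""
    env = os.environ if env is None else env
    # Report variables that are missing or empty, in the order listed above.
    problems = [f"{name} is not set" for name in REQUIRED_VARS if not env.get(name)]
    # Ports must be plain decimal numbers.
    for name in PORT_VARS:
        value = env.get(name)
        if value and not value.isdigit():
            problems.append(f"{name} is not a valid port number: {value!r}")
    return problems
```

Calling ``check_environment()`` with no arguments inspects the current process environment, while passing a dict makes the check easy to try without changing your shell.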

Running

Bootstrap the database (one time only):

$ bootstrap --config schema.json

Run pgsync as a daemon:

$ pgsync --config schema.json --daemon
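If you prefer to drive the two steps from a script, the following sketch wraps the same two commands with Python's ``subprocess`` module. ``build_commands`` and ``run_all`` are hypothetical helper names, and the sketch assumes both executables are on your ``PATH``.

```python
import subprocess


def build_commands(config_path):
    """Return the bootstrap and daemon command lines as argument lists."""
    return [
        ["bootstrap", "--config", config_path],
        ["pgsync", "--config", config_path, "--daemon"],
    ]


def run_all(config_path):
    """Run the bootstrap step, then start pgsync; stop at the first failure."""
    for argv in build_commands(config_path):
        # check=True raises CalledProcessError on a non-zero exit status.
        # The pgsync call is expected to block for as long as the process runs.
        subprocess.run(argv, check=True)
```

Because bootstrapping is a one-time step, a real script would normally call only the second command on later runs.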

History

1.0.1 (2020-01-15)

  • First release on PyPI.

1.0.1 (2020-01-01)

  • RC1 release

1.1.0 (2020-04-13)

  • Postgres multi-schema support for multi-tenant applications

  • Show the resulting query in verbose mode

  • This release requires you to re-bootstrap your database with:

    • bootstrap -t
    • bootstrap

1.1.1 (2020-05-18)

  • Fixed authentication with Redis
  • Fixed Docker build

1.1.2 (2020-06-11)

  • Sync multiple indices in the same schema
  • Test for replication or superuser
  • Fix PG_NOTIFY error when payload exceeds 8000 bytes limit

1.1.3 (2020-06-14)

  • Bug fix when syncing multiple indices in the same schema

1.1.4 (2020-06-15)

  • Only create triggers for tables present in schema

1.1.5 (2020-06-16)

  • Bug fix when creating multiple triggers in same schema

1.1.6 (2020-07-31)

  • Bug fix when tearing down secondary schema

1.1.7 (2020-08-16)

  • Fix issue #29: SQLAlchemy err: Neither 'BooleanClauseList' object nor 'Comparator' object has an attribute '_orig'

1.1.8 (2020-08-19)

  • Fix issue #30: Traceback AttributeError: id

1.1.9 (2020-08-26)

  • Fix issue #33: Unable to set Redis port via environment variable.

1.1.10 (2020-08-29)

  • Support Amazon RDS #16
  • Optimize database reflection on startup
  • Show elapsed time

1.1.11 (2020-09-08)

  • Support specifying the Elasticsearch field data type
