PostgreSQL to Elasticsearch sync

`PGSync <https://pgsync.com>`_ is middleware for syncing data from `Postgres <https://www.postgresql.org>`_ to `Elasticsearch <https://www.elastic.co/products/elastic-stack>`_.
It allows you to keep `Postgres <https://www.postgresql.org>`_ as your source of truth and expose structured, denormalized documents in `Elasticsearch <https://www.elastic.co/products/elastic-stack>`_.

Requirements

  • `Python <https://www.python.org>`_ 3.6+
  • `Postgres <https://www.postgresql.org>`_ 9.4+
  • `Redis <https://redis.io>`_
  • `Elasticsearch <https://www.elastic.co/products/elastic-stack>`_ 6.3.1+

Postgres setup

Enable `logical decoding <https://www.postgresql.org/docs/current/logicaldecoding.html>`_ in your Postgres settings.

  • You also need to set two parameters in your Postgres configuration file, postgresql.conf:

    wal_level = logical

    max_replication_slots = 1
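
As a quick sanity check before bootstrapping, the two parameters above can be verified against the text of a configuration file. The following is a minimal sketch, not part of PGSync: the helper names ``parse_conf`` and ``check_logical_decoding`` are hypothetical, the parser handles only simple ``name = value`` lines with ``#`` comments, and it treats a missing setting as a problem even though the server would fall back to its built-in default.

```python
import re


def parse_conf(text):
    """Parse simple 'name = value' lines from postgresql.conf-style text.

    Simplification: anything after '#' is treated as a comment, so quoted
    values containing '#' are not supported.
    """
    settings = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()
        if not line:
            continue
        # The equals sign is optional in postgresql.conf.
        match = re.match(r"^([A-Za-z_][\w.]*)\s*=?\s*(.*)$", line)
        if match:
            name, value = match.groups()
            settings[name.lower()] = value.strip().strip("'")
    return settings


def check_logical_decoding(settings):
    """Return a list of problems with the two parameters named above."""
    problems = []
    if settings.get("wal_level") != "logical":
        problems.append("wal_level must be 'logical'")
    try:
        slots = int(settings.get("max_replication_slots", ""))
    except ValueError:
        slots = None
    if slots is None or slots < 1:
        problems.append("max_replication_slots must be at least 1")
    return problems
```

Calling ``check_logical_decoding(parse_conf(open(path).read()))`` with the path to your own postgresql.conf returns an empty list when both parameters are set as described.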

Installation

You can install PGSync from `PyPI <https://pypi.org>`_:

$ pip install pgsync

Config

Create a schema file for the application, e.g. schema.json.

`Example schema <https://github.com/toluaina/pg-sync/blob/master/examples/airbnb/schema.json>`_

Example spec

.. code-block::

    [
        {
            "database": "[database name]",
            "index": "[elasticsearch index]",
            "nodes": [
                {
                    "table": "[table A]",
                    "schema": "[table A schema]",
                    "columns": [
                        "column 1 from table A",
                        "column 2 from table A",
                        ... additional columns
                    ],
                    "children": [
                        {
                            "table": "[table B with relationship to table A]",
                            "schema": "[table B schema]",
                            "columns": [
                              "column 1 from table B",
                              "column 2 from table B",
                              ... additional columns
                            ],
                            "relationship": {
                                "variant": "object",
                                "type": "one_to_many"
                            },
                            ...
                        },
                        {
                            ... any other additional children
                        }
                    ]
                }
            ]
        }
    ]
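
To make the placeholders in the spec concrete, here is a small sketch that builds a schema for two hypothetical tables (``book`` and ``review`` in a ``bookstore`` database) and checks that it has the shape shown above. The table, column, and helper names are illustrative assumptions. Which keys PGSync actually requires is also an assumption: the checks below only mirror the keys that appear in the example spec.

```python
import json

# Hypothetical database, index, tables, and columns, for illustration only.
schema = [
    {
        "database": "bookstore",
        "index": "books",
        "nodes": [
            {
                "table": "book",
                "schema": "public",
                "columns": ["isbn", "title"],
                "children": [
                    {
                        "table": "review",
                        "schema": "public",
                        "columns": ["rating", "body"],
                        "relationship": {
                            "variant": "object",
                            "type": "one_to_many",
                        },
                    }
                ],
            }
        ],
    }
]


def check_node(node, is_child, path):
    """Collect problems for one node and, recursively, its children."""
    problems = []
    if not isinstance(node.get("table"), str):
        problems.append(f"{path}: missing 'table'")
    if is_child:
        relationship = node.get("relationship", {})
        for key in ("variant", "type"):
            if key not in relationship:
                problems.append(f"{path}: relationship is missing '{key}'")
    for i, child in enumerate(node.get("children", [])):
        problems += check_node(child, True, f"{path}.children[{i}]")
    return problems


def check_schema(documents):
    """Check the top-level keys of each entry, then walk its nodes."""
    problems = []
    for d, doc in enumerate(documents):
        for key in ("database", "index", "nodes"):
            if key not in doc:
                problems.append(f"[{d}]: missing '{key}'")
        for n, node in enumerate(doc.get("nodes", [])):
            problems += check_node(node, False, f"[{d}].nodes[{n}]")
    return problems
```

If ``check_schema(schema)`` returns an empty list, ``json.dumps(schema, indent=4)`` produces text that can be saved as schema.json.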

Environment variables

Set up the required environment variables for the application:

SCHEMA='/path/to/schema.json'

ELASTICSEARCH_HOST=localhost
ELASTICSEARCH_PORT=9200

PG_HOST=localhost
PG_USER=i-am-root # this must be a postgres superuser
PG_PORT=5432
PG_PASSWORD=*****

REDIS_HOST=redis
REDIS_PORT=6379
REDIS_DB=0
REDIS_AUTH=*****
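
The sketch below shows one way a Python program could read the variables listed above and turn them into connection settings. It is not how PGSync itself loads its configuration. The ``Settings`` type, the ``load_settings`` helper, the fallback defaults, and the choice of ``SCHEMA`` and ``PG_USER`` as mandatory are all assumptions made for illustration.

```python
import os
from typing import Mapping, NamedTuple
from urllib.parse import quote


class Settings(NamedTuple):
    schema: str
    elasticsearch_url: str
    pg_host: str
    pg_port: int
    pg_user: str
    pg_password: str
    redis_url: str


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Build Settings from environment-style variables."""
    # Assumption: these two have no sensible default, so fail early.
    missing = [name for name in ("SCHEMA", "PG_USER") if not env.get(name)]
    if missing:
        raise ValueError("missing environment variables: " + ", ".join(missing))

    es_host = env.get("ELASTICSEARCH_HOST", "localhost")
    es_port = int(env.get("ELASTICSEARCH_PORT", "9200"))

    redis_host = env.get("REDIS_HOST", "localhost")
    redis_port = int(env.get("REDIS_PORT", "6379"))
    redis_db = int(env.get("REDIS_DB", "0"))
    auth = env.get("REDIS_AUTH", "")
    # Percent-encode the password so characters like '@' stay unambiguous.
    credentials = f":{quote(auth, safe='')}@" if auth else ""

    return Settings(
        schema=env["SCHEMA"],
        elasticsearch_url=f"http://{es_host}:{es_port}",
        pg_host=env.get("PG_HOST", "localhost"),
        pg_port=int(env.get("PG_PORT", "5432")),
        pg_user=env["PG_USER"],
        pg_password=env.get("PG_PASSWORD", ""),
        redis_url=f"redis://{credentials}{redis_host}:{redis_port}/{redis_db}",
    )
```

Passing a plain dictionary instead of ``os.environ`` makes it easy to try the function without touching the real environment.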

Running

Bootstrap the database (one time only):

$ bootstrap --config schema.json

Run pgsync as a daemon:

$ pgsync --config schema.json --daemon
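
If you prefer to drive these two steps from Python, for example in a deployment script, a thin wrapper around ``subprocess`` is enough. This is a sketch under stated assumptions: the function names and the ``first_run`` parameter are hypothetical, and only the commands and flags shown above are used.

```python
import subprocess


def pgsync_commands(config, daemon=True):
    """Return the bootstrap and sync command lines for a schema file."""
    bootstrap = ["bootstrap", "--config", config]
    sync = ["pgsync", "--config", config]
    if daemon:
        sync.append("--daemon")
    return bootstrap, sync


def start(config, first_run=False, run=subprocess.run):
    """Optionally bootstrap (one time only), then start the sync.

    `run` is injectable so the sequence can be exercised without
    actually launching the commands.
    """
    bootstrap, sync = pgsync_commands(config)
    if first_run:
        # check=True raises if bootstrap fails, so the sync never starts
        # against a database that was not bootstrapped.
        run(bootstrap, check=True)
    run(sync, check=True)
```

Call ``start("schema.json", first_run=True)`` the first time and ``start("schema.json")`` afterwards, since bootstrapping is a one-time step.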

History

1.0.1 (2020-01-15)

  • First release on PyPI.

1.0.1 (2020-01-01)

  • RC1 release

1.1.0 (2020-04-13)

  • Postgres multi-schema support for multi-tenant applications

  • Show the resulting query in verbose mode

  • This release requires you to re-bootstrap your database with:

    • bootstrap -t
    • bootstrap

1.1.1 (2020-05-18)

  • Fixed authentication with Redis
  • Fixed Docker build

1.1.2 (2020-06-11)

  • Sync multiple indices in the same schema
  • Test for replication or superuser
  • Fix PG_NOTIFY error when the payload exceeds the 8000-byte limit

1.1.3 (2020-06-14)

  • Bug fix when syncing multiple indices in the same schema

1.1.4 (2020-06-15)

  • Only create triggers for tables present in schema

1.1.5 (2020-06-16)

  • Bug fix when creating multiple triggers in same schema

1.1.6 (2020-07-31)

  • Bug fix when tearing down secondary schema
