Enhanced PostgreSQL to Elasticsearch Data Synchronization
⚡ pg2elastic
📚 Description
Welcome to pg2elastic, a fork of the official pgsync package, designed to provide seamless and efficient data synchronization between PostgreSQL databases and Elasticsearch clusters. Building upon the solid foundation of pgsync, pg2elastic inherits all of its powerful capabilities and takes them a step further.
Key Features:
- High-Performance Sync: pg2elastic inherits the robust data synchronization engine from pgsync, ensuring lightning-fast and reliable transfers.
- Real-time Indexing: Mirror your PostgreSQL data into Elasticsearch indices and keep them in sync in real time.
- Schema Mapping: Easily define and customize the mapping of PostgreSQL schemas to Elasticsearch indexes, giving you full control over the data structure.
- Efficient Data Types Handling: pg2elastic effortlessly handles data type conversions, ensuring accurate representation across platforms.
- Continuous Enhancements: We are committed to actively maintaining and enhancing pg2elastic, incorporating the latest advancements in both PostgreSQL and Elasticsearch technologies.
Whether you're working on a data-driven application or performing complex data analysis, pg2elastic empowers you with a streamlined and feature-rich solution for harmonizing your PostgreSQL and Elasticsearch ecosystems.
🛠️ Prerequisites
✨ Key Enhancements
Fixes
The numeric value 1000000 was rendered as 1e, which crashed the process when the value was converted to a float.
- A fix was implemented to change 1e to 1e6, so the conversion to float no longer crashes the process.
The process crashes when custom field types are used in the database and the returned field type is in the format abc.xyz.
- A fix was implemented by modifying the regular expression for LOGICAL_SLOT_SUFFIX.
The process crashes when a partition notification is received.
- A fix was implemented: partitions are now tracked in the library's materialized view, and the library's trigger is updated to include the partition's parent table.
Inserting a child record does not update the parent document.
- A fix was implemented that synchronizes the main document whenever a child record is created.
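To illustrate the first fix above: Python's built-in float() rejects a string with a dangling exponent marker such as 1e, but accepts 1e6. The following is only a minimal sketch of that behavior, not the patch applied in pg2elastic; the helper name to_float is hypothetical.

```python
from typing import Optional


def to_float(text: str) -> Optional[float]:
    """Return the float value of text, or None if it is not a valid float literal."""
    try:
        return float(text)
    except ValueError:
        # float() raises ValueError for malformed input such as "1e"
        return None


print(to_float("1e"))   # dangling exponent marker: not a valid float literal
print(to_float("1e6"))  # valid scientific notation for 1000000
```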
Environment Variables
PG_SCHEMA
- Environment variable to enhance performance by eliminating the need to scan all schemas
REDIS_USERNAME
- Environment variable to specify redis username
REDIS_PASSWORD
- Environment variable to specify redis password
REDIS_ENDPOINT
- Environment variable to specify redis connection endpoint, defaults to localhost
REDIS_SSL
- Environment variable to specify if redis connection should use ssl, defaults to true
REDIS_CLUSTER
- Environment variable to specify if redis connection is clustered, defaults to true
REDIS_CHECKPOINT
- Environment variable to specify if redis will be used to save restore checkpoints, defaults to true
SKIP_BOOTSTRAP
- Environment variable to specify if the bootstrap command should be skipped, defaults to true
- Use this env variable if the bootstrap command was already run and the bootstrap command is hard-coded in a shell script.
- Set to false in case there are new indexes or schema changes
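As a rough illustration of how the variables and defaults listed above fit together, the sketch below reads them into a dictionary. It is not pg2elastic's actual settings code: the helper names env_bool and load_settings, the accepted boolean spellings, and the example schema name public are all assumptions made for this example.

```python
import os


def env_bool(name, default, environ=None):
    """Read a boolean flag; true/1/yes/on (case-insensitive) count as True (assumed spellings)."""
    environ = os.environ if environ is None else environ
    raw = environ.get(name)
    if raw is None or raw.strip() == "":
        return default
    return raw.strip().lower() in {"true", "1", "yes", "on"}


def load_settings(environ=None):
    """Collect the documented variables, applying the documented defaults."""
    environ = os.environ if environ is None else environ
    return {
        "pg_schema": environ.get("PG_SCHEMA"),
        "redis_username": environ.get("REDIS_USERNAME"),
        "redis_password": environ.get("REDIS_PASSWORD"),
        "redis_endpoint": environ.get("REDIS_ENDPOINT", "localhost"),
        "redis_ssl": env_bool("REDIS_SSL", True, environ),
        "redis_cluster": env_bool("REDIS_CLUSTER", True, environ),
        "redis_checkpoint": env_bool("REDIS_CHECKPOINT", True, environ),
        "skip_bootstrap": env_bool("SKIP_BOOTSTRAP", True, environ),
    }


# Example: a new index was added, so bootstrap must run again.
settings = load_settings({"PG_SCHEMA": "public", "SKIP_BOOTSTRAP": "false"})
print(settings["redis_endpoint"], settings["skip_bootstrap"])
```

Passing a plain dictionary instead of os.environ keeps the example easy to try without touching the real environment.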
🚀 Deployment
Manual Deployment
How to run pg2elastic and initialize it.
- Create a .env file using the cp .env.sample .env command and replace the existing environment variables with personal configuration settings.
- Download dependencies using python setup.py develop
- Start the app by running the pg2elastic file from the bin folder, using python3 pg2elastic --schema yourschema.json
If you do not run the full setup, you will get errors when running this package.
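The --schema argument above expects a JSON schema file. The sketch below generates a minimal one, assuming pg2elastic keeps the schema layout of upstream pgsync (a list of entries with database, index, and nodes keys); this README does not confirm that. The database, index, table, and column names are hypothetical placeholders.

```python
import json
from pathlib import Path

# Assumed pgsync-style schema: one entry per database/index pair.
# All names below (book, isbn, title) are placeholders for illustration.
schema = [
    {
        "database": "book",
        "index": "book",
        "nodes": {
            "table": "book",
            "columns": ["isbn", "title"],
        },
    }
]


def write_schema(path: Path) -> Path:
    """Serialize the schema to path as indented JSON and return the path."""
    path.write_text(json.dumps(schema, indent=2), encoding="utf-8")
    return path


written = write_schema(Path("yourschema.json"))
print(f"wrote {written}")
```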
✅ Testing
$ export PG_SCHEMA=
$ flake8 pg2elastic tests
$ python setup.py test
🔊 Logs
This project uses the loguru module for logging. The loguru configuration can be found in the pg2elastic file in the bin folder.
🚚 Deployment
$ python setup.py sdist bdist_wheel
$ twine upload dist/*