A Change Data Capture (CDC) library for data synchronization
EvolvisHub Data Handler
A robust Change Data Capture (CDC) library for efficient data synchronization across various databases and storage systems.
Features
- Multi-Database Support: Seamlessly sync data between PostgreSQL, MySQL, SQL Server, Oracle, MongoDB, and more
- Cloud Storage Integration: Native support for AWS S3, Google Cloud Storage, and Azure Blob Storage
- File System Support: Handle CSV, JSON, Parquet, and other file formats
- Watermark Tracking: Efficient incremental sync with configurable watermark columns
- Batch Processing: Optimize performance with configurable batch sizes
- Error Handling: Robust error recovery and logging
- Type Safety: Full type hints and validation with Pydantic
- Extensible: Easy to add new adapters and data sources
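To illustrate what watermark tracking with batching means in practice, here is a minimal sketch that uses the standard-library `sqlite3` module. It is not the library's implementation: the `items` table, the `make_db` helper, and the `sync_incremental` function are hypothetical names chosen for the example, and the composite `(updated_at, id)` watermark is an assumption made so that rows sharing a timestamp are not skipped at a batch boundary.

```python
import sqlite3

SCHEMA = "CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT, updated_at TEXT)"


def make_db(rows=()):
    """Create an in-memory database with the demo table and optional rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(SCHEMA)
    conn.executemany("INSERT INTO items VALUES (?, ?, ?)", rows)
    conn.commit()
    return conn


def sync_incremental(src, dst, watermark, batch_size=1000):
    """Copy rows newer than `watermark` in batches and return the new watermark.

    The watermark is an (updated_at, id) pair so that rows sharing a timestamp
    are not skipped when a batch boundary falls between them.
    """
    while True:
        last_ts, last_id = watermark
        rows = src.execute(
            "SELECT id, name, updated_at FROM items "
            "WHERE updated_at > ? OR (updated_at = ? AND id > ?) "
            "ORDER BY updated_at, id LIMIT ?",
            (last_ts, last_ts, last_id, batch_size),
        ).fetchall()
        if not rows:
            return watermark
        # Upsert so that a changed source row overwrites its older copy.
        dst.executemany("INSERT OR REPLACE INTO items VALUES (?, ?, ?)", rows)
        dst.commit()
        # Advance the watermark to the last row copied in this batch.
        watermark = (rows[-1][2], rows[-1][0])
```

Calling `sync_incremental(src, dst, ("1970-01-01 00:00:00", 0))` copies everything once; passing the returned watermark to the next call copies only rows changed since then.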
Installation
# Install from PyPI
pip install evolvishub-data-handler
# Install with development dependencies (quotes keep shells such as zsh from expanding the brackets)
pip install "evolvishub-data-handler[dev]"
# Install with documentation dependencies
pip install "evolvishub-data-handler[docs]"
Quick Start
- Create a configuration file (e.g., config.yaml):
source:
  type: postgresql
  host: localhost
  port: 5432
  database: source_db
  user: source_user
  password: source_password
  watermark:
    column: updated_at
    type: timestamp
    initial_value: "1970-01-01 00:00:00"
destination:
  type: postgresql
  host: localhost
  port: 5432
  database: dest_db
  user: dest_user
  password: dest_password
  watermark:
    column: updated_at
    type: timestamp
    initial_value: "1970-01-01 00:00:00"
sync:
  batch_size: 1000
  interval_seconds: 60
  watermark_table: sync_watermark
- Use the library in your code:
from evolvishub_data_handler import CDCHandler
# Initialize the handler
handler = CDCHandler("config.yaml")
# Run one-time sync
handler.sync()
# Or, instead of a one-time sync, run continuous sync
handler.run_continuous()
- Or use the command-line interface:
# One-time sync
evolvishub-cdc -c config.yaml
# Continuous sync
evolvishub-cdc -c config.yaml -m continuous
# With custom logging
evolvishub-cdc -c config.yaml -l DEBUG --log-file sync.log
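The configuration in step 1 has a regular shape (two endpoints, each with a nested watermark, plus sync settings), so it can be checked before a sync starts. The feature list mentions Pydantic validation, but the library's actual models are not shown here. The sketch below is therefore an assumption-laden stand-in built on standard-library dataclasses. The class names, `parse_config`, and the specific checks (port range, positive batch size and interval) are hypothetical. `EXAMPLE` mirrors the dictionary a YAML loader would produce from the file above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Watermark:
    column: str
    type: str
    initial_value: str


@dataclass(frozen=True)
class Endpoint:
    type: str
    host: str
    port: int
    database: str
    user: str
    password: str
    watermark: Watermark

    def __post_init__(self):
        # Hypothetical check: reject ports outside the valid TCP range.
        if not 1 <= self.port <= 65535:
            raise ValueError(f"port out of range: {self.port}")


@dataclass(frozen=True)
class SyncSettings:
    batch_size: int
    interval_seconds: int
    watermark_table: str

    def __post_init__(self):
        # Hypothetical check: both values must be positive.
        if self.batch_size <= 0 or self.interval_seconds <= 0:
            raise ValueError("batch_size and interval_seconds must be positive")


def parse_endpoint(raw):
    """Build an Endpoint from a plain dict, converting the nested watermark."""
    fields = dict(raw)
    fields["watermark"] = Watermark(**fields["watermark"])
    return Endpoint(**fields)


def parse_config(raw):
    """Return (source, destination, sync) or raise on a malformed config."""
    return (
        parse_endpoint(raw["source"]),
        parse_endpoint(raw["destination"]),
        SyncSettings(**raw["sync"]),
    )


_WATERMARK = {"column": "updated_at", "type": "timestamp",
              "initial_value": "1970-01-01 00:00:00"}
EXAMPLE = {
    "source": {"type": "postgresql", "host": "localhost", "port": 5432,
               "database": "source_db", "user": "source_user",
               "password": "source_password", "watermark": dict(_WATERMARK)},
    "destination": {"type": "postgresql", "host": "localhost", "port": 5432,
                    "database": "dest_db", "user": "dest_user",
                    "password": "dest_password", "watermark": dict(_WATERMARK)},
    "sync": {"batch_size": 1000, "interval_seconds": 60,
             "watermark_table": "sync_watermark"},
}
```

A missing or misspelled key surfaces as a `KeyError` or `TypeError` from `parse_config(EXAMPLE)`-style calls, which is usually easier to diagnose than a failure partway through a sync.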
Supported Data Sources
Databases
- PostgreSQL
- MySQL
- SQL Server
- Oracle
- MongoDB
Cloud Storage
- AWS S3
- Google Cloud Storage
- Azure Blob Storage
File Systems
- CSV files
- JSON files
- Parquet files
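As a rough picture of how one data source could sit behind a common adapter interface, here is a small sketch. It does not reproduce the library's adapter API, which is not documented in this text: `SourceAdapter`, `CsvAdapter`, and `read_batches` are hypothetical names, and only the standard library is used.

```python
import csv
from abc import ABC, abstractmethod
from itertools import islice


class SourceAdapter(ABC):
    """Hypothetical adapter interface; not the library's actual base class."""

    @abstractmethod
    def read_batches(self, batch_size):
        """Yield lists of row dicts, each at most `batch_size` long."""


class CsvAdapter(SourceAdapter):
    """Reads a CSV file with a header row in fixed-size batches."""

    def __init__(self, path):
        self.path = path

    def read_batches(self, batch_size):
        with open(self.path, newline="", encoding="utf-8") as handle:
            reader = csv.DictReader(handle)
            while True:
                # islice pulls at most batch_size rows without loading the whole file.
                batch = list(islice(reader, batch_size))
                if not batch:
                    return
                yield batch
```

With an interface like this, supporting another format would mean adding one more subclass that yields the same batch shape.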
Development
Setup
- Clone the repository:
git clone https://github.com/evolvishub/evolvishub-data-handler.git
cd evolvishub-data-handler
- Create a virtual environment:
make venv
- Install development dependencies:
make install
- Install pre-commit hooks:
make install-hooks
Testing
Run the test suite:
make test
Code Quality
Format code:
make format
Run linters:
make lint
Building
Build the package:
make build
Contributing
- Fork the repository
- Create a feature branch
- Commit your changes
- Push to the branch
- Create a Pull Request
License
This project is licensed under the MIT License - see the LICENSE file for details.
Support
- Documentation: https://evolvishub.github.io/evolvishub-data-handler
- Issues: https://github.com/evolvishub/evolvishub-data-handler/issues
- Email: info@evolvishub.com
EvolvisHub Data Handler Adapter
A powerful and flexible data handling adapter for Evolvis AI's data processing pipeline. This tool provides seamless integration with various database systems and implements Change Data Capture (CDC) functionality.
About Evolvis AI
Evolvis AI is a leading provider of AI solutions that helps businesses unlock their data potential. We specialize in:
- Data analysis and decision-making
- Machine learning implementation
- Process optimization
- Predictive maintenance
- Natural language processing
- Custom AI solutions
Our mission is to make artificial intelligence accessible to businesses of all sizes, enabling them to compete in today's data-driven environment. As Forbes highlights: "Organizations that strategically adopt AI will have a significant competitive advantage in today's data-driven market."
Author
Alban Maxhuni, PhD
Email: a.maxhuni@evolvis.ai
Download files
File details
Details for the file evolvishub_data_handler-0.1.0.tar.gz.
File metadata
- Download URL: evolvishub_data_handler-0.1.0.tar.gz
- Upload date:
- Size: 16.8 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.13.3
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | f74e7c2958306598006c9aff8b6b467f6e24fc198a7cc974b9645c3d9f93ea9e |
| MD5 | 0424684ff65b207a12eb13172ed6cf45 |
| BLAKE2b-256 | 4076407035022d5b143bbeeab790e856f34aee7091e7acac108ab37c9cb63d9c |
File details
Details for the file evolvishub_data_handler-0.1.0-py3-none-any.whl.
File metadata
- Download URL: evolvishub_data_handler-0.1.0-py3-none-any.whl
- Upload date:
- Size: 17.2 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.13.3
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | bc87de9bd1ac6ab212fa29dda7ff15875895f0a229a4801e6d4c1018201c19a1 |
| MD5 | 710a7de152d7ebf082a93ed72e1e3a70 |
| BLAKE2b-256 | b89241788b24912bb9efb2fbc0e1d821d091d754304806d9b52c1c4ec2d23744 |
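The digests listed for each file can be used to check a download before installing it. The following is a minimal sketch using Python's `hashlib`. The helper names `sha256_of` and `verify` are illustrative, and the local file path is whatever location the file was saved to.

```python
import hashlib


def sha256_of(path, chunk_size=65536):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path, expected_hex):
    """Return True if the file's SHA256 matches the published digest."""
    return sha256_of(path) == expected_hex.strip().lower()
```

For example, `verify("evolvishub_data_handler-0.1.0-py3-none-any.whl", "<SHA256 digest from the table>")` should return `True` for an intact download of the wheel.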