core-cdc (CDC, a.k.a. Change Data Capture)

core-cdc provides the core mechanisms and resources needed to implement Change Data Capture (CDC) services.



Installation

Install from PyPI using pip:

pip install core-cdc
uv pip install core-cdc  # Alternatively, with uv

Features

Multi-Database CDC Support
  • MySQL Binary Log (BinLog) based change capture

  • MongoDB Change Streams for real-time event streaming

  • Extensible processor architecture for additional database engines
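One way to picture an extensible processor architecture is an abstract base class that each database engine subclasses. The sketch below is illustrative only: `BaseProcessor`, `get_events`, `run`, and `InMemoryProcessor` are hypothetical names and are not taken from the core-cdc API.

```python
from abc import ABC, abstractmethod
from typing import Any, Dict, Iterator, List


class BaseProcessor(ABC):
    """Hypothetical base class: one subclass per database engine."""

    @abstractmethod
    def get_events(self) -> Iterator[Dict[str, Any]]:
        """Yield raw change events from the source engine."""

    def run(self) -> List[Dict[str, Any]]:
        """Collect every event the engine yields (logic shared by all engines)."""
        return list(self.get_events())


class InMemoryProcessor(BaseProcessor):
    """Toy engine that replays a fixed list of events."""

    def __init__(self, events: List[Dict[str, Any]]) -> None:
        self._events = events

    def get_events(self) -> Iterator[Dict[str, Any]]:
        yield from self._events


processor = InMemoryProcessor([{"op": "INSERT", "table": "users", "id": 1}])
captured = processor.run()
```

Under this design, supporting a new engine means implementing only the engine-specific `get_events` method while the shared logic stays in the base class.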

Comprehensive Event Handling
  • DML operations: INSERT, UPDATE, DELETE

  • DDL operations: CREATE, ALTER, DROP (schemas and tables)

  • Configurable event filtering by operation type
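Filtering by operation type can be sketched as a simple allow-list check. The `Operation` enum, the `filter_events` function, and the event dictionary shape are assumptions made for illustration; they are not the library's actual configuration interface.

```python
from enum import Enum
from typing import Dict, Iterable, Iterator, Set


class Operation(str, Enum):
    """Operation types named in the feature list (DML and DDL)."""

    INSERT = "INSERT"
    UPDATE = "UPDATE"
    DELETE = "DELETE"
    CREATE = "CREATE"
    ALTER = "ALTER"
    DROP = "DROP"


def filter_events(
    events: Iterable[Dict[str, str]], allowed: Set[Operation]
) -> Iterator[Dict[str, str]]:
    """Yield only the events whose operation type is in `allowed`."""
    for event in events:
        if Operation(event["op"]) in allowed:
            yield event


events = [
    {"op": "INSERT", "table": "orders"},
    {"op": "ALTER", "table": "orders"},
    {"op": "DELETE", "table": "orders"},
]
dml = {Operation.INSERT, Operation.UPDATE, Operation.DELETE}
dml_only = list(filter_events(events, dml))
```

Here the DDL event (`ALTER`) is dropped and the two DML events pass through in their original order.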

Flexible Target Replication
  • MySQL target for database-to-database replication

  • Snowflake target for data warehouse integration

  • Support for multiple simultaneous targets
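Replicating to several targets at once amounts to fanning each batch of records out to every configured target. The following is a minimal sketch under stated assumptions: `Target`, `ListTarget`, and `replicate` are hypothetical names, and the in-memory target stands in for a real MySQL or Snowflake connection.

```python
from typing import Any, Dict, List, Protocol


class Target(Protocol):
    """Anything that can persist a batch of records (hypothetical interface)."""

    def save(self, records: List[Dict[str, Any]]) -> None: ...


class ListTarget:
    """Stand-in target that keeps records in memory instead of a database."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.saved: List[Dict[str, Any]] = []

    def save(self, records: List[Dict[str, Any]]) -> None:
        self.saved.extend(records)


def replicate(records: List[Dict[str, Any]], targets: List[Target]) -> None:
    """Send the same batch to every target, in order."""
    for target in targets:
        target.save(records)


primary = ListTarget("mysql")
warehouse = ListTarget("snowflake")
replicate([{"id": 1, "status": "paid"}], [primary, warehouse])
```

A real implementation would also need to decide what happens when one target fails and the others succeed; this sketch does not address that.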

Standardized Data Format
  • Common Record structure for cross-service integration

  • Includes metadata: timestamps, transaction IDs, source position

  • JSON serialization support for streaming and messaging systems
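A common record structure with metadata and JSON serialization might look like the dataclass below. This is a sketch, not the library's `Record` class: the class name `ChangeRecord`, every field name, and the position format are assumptions chosen to mirror the metadata listed above.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Any, Dict, Optional


@dataclass
class ChangeRecord:
    """Hypothetical record shape; field names are assumptions."""

    event_type: str                          # e.g. INSERT, UPDATE, DELETE
    schema_name: str
    table_name: str
    event_timestamp: int                     # epoch seconds of the source event
    transaction_id: Optional[str] = None
    source_position: Optional[str] = None    # e.g. a log file name and offset
    record: Dict[str, Any] = field(default_factory=dict)

    def to_json(self) -> str:
        """Serialize to a JSON string suitable for a message queue or stream."""
        return json.dumps(asdict(self), sort_keys=True)


rec = ChangeRecord(
    event_type="INSERT",
    schema_name="shop",
    table_name="orders",
    event_timestamp=1700000000,
    transaction_id="tx-1",
    source_position="binlog.000001:154",
    record={"id": 1},
)
payload = rec.to_json()
restored = ChangeRecord(**json.loads(payload))
```

Because every field is a plain JSON type, a consumer can rebuild an equal record from the payload without any engine-specific code.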

Production-Ready Features
  • Built-in error handling and retry mechanisms

  • Comprehensive logging for monitoring and debugging

  • Optional event timestamp column for UPSERT/MERGE operations
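One plausible use of an event timestamp column in an UPSERT is to ignore changes that arrive out of order. The sketch below illustrates that idea with SQLite from the standard library so it runs without a server; the table, the `event_ts` column name, and the `apply_change` helper are assumptions, and a MySQL or Snowflake target would use its own UPSERT/MERGE syntax.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, status TEXT, event_ts INTEGER)"
)

# Insert new rows; update existing rows only when the incoming event is newer.
UPSERT = """
INSERT INTO orders (id, status, event_ts) VALUES (?, ?, ?)
ON CONFLICT(id) DO UPDATE SET
    status = excluded.status,
    event_ts = excluded.event_ts
WHERE excluded.event_ts > orders.event_ts
"""


def apply_change(row_id: int, status: str, event_ts: int) -> None:
    """Apply one change event to the target table."""
    conn.execute(UPSERT, (row_id, status, event_ts))


apply_change(1, "created", 100)
apply_change(1, "shipped", 300)
apply_change(1, "paid", 200)  # arrives late; older than the stored row, so skipped

row = conn.execute("SELECT status, event_ts FROM orders WHERE id = 1").fetchone()
```

After the three calls the row still holds the `shipped` state with timestamp 300, because the late event with timestamp 200 does not satisfy the `WHERE` guard.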

Quick Start

Installation

Install the package:

pip install core-cdc
uv pip install core-cdc     # Alternatively, with uv
pip install -e ".[dev]"     # Editable install with development dependencies

Setting Up Environment

  1. Install required libraries:

pip install --upgrade pip
pip install virtualenv

  2. Create a Python virtual environment:

virtualenv --python=python3.12 .venv

  3. Activate the virtual environment:

source .venv/bin/activate

Install packages

pip install .
pip install -e ".[dev]"

Optional libraries

pip install '.[all]'  # All optional dependencies
pip install '.[mysql]'
pip install '.[mongo]'
pip install '.[snowflake]'

Check tests and coverage

python manager.py run-tests
python manager.py run-tests --test-type integration
python manager.py run-coverage

# With the Docker containers up and running, you can run the functional
# tests, which verify that the CDC services work as expected.
python manager.py run-tests --test-type functional --pattern "*.py"

Implemented CDC Engines

The following database engines have CDC implementations:

Fully Implemented

MySQL - Binary Log (BinLog) based CDC
  • Uses mysql-replication library

  • Captures INSERT, UPDATE, DELETE operations

  • Supports DDL events (CREATE, ALTER, DROP)

  • Fallback mechanism for column name resolution

  • See: core_cdc/processors/mysql/
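For orientation, this is roughly how row events can be read directly with the mysql-replication library (imported as `pymysqlreplication`). It is a sketch of the underlying library usage, not of the core-cdc processor: `normalize_row`, `stream_changes`, the output dictionary shape, and the default `server_id` are assumptions. The library import is deferred so that the pure helper works without the optional dependency or a running MySQL server.

```python
from typing import Any, Dict, Iterator


def normalize_row(operation: str, row: Dict[str, Any]) -> Dict[str, Any]:
    """Flatten a mysql-replication row dict into before/after images."""
    if operation == "UPDATE":
        return {"op": operation, "before": row["before_values"], "after": row["after_values"]}
    if operation == "DELETE":
        return {"op": operation, "before": row["values"], "after": None}
    return {"op": operation, "before": None, "after": row["values"]}


def stream_changes(
    connection_settings: Dict[str, Any], server_id: int = 100
) -> Iterator[Dict[str, Any]]:
    """Yield normalized DML changes from the MySQL binary log."""
    # Deferred import: only needed when actually connecting to MySQL.
    from pymysqlreplication import BinLogStreamReader
    from pymysqlreplication.row_event import (
        DeleteRowsEvent,
        UpdateRowsEvent,
        WriteRowsEvent,
    )

    operations = {
        WriteRowsEvent: "INSERT",
        UpdateRowsEvent: "UPDATE",
        DeleteRowsEvent: "DELETE",
    }
    stream = BinLogStreamReader(
        connection_settings=connection_settings,
        server_id=server_id,          # must be unique among replicas
        only_events=list(operations),
        blocking=True,                # wait for new events instead of stopping
        resume_stream=True,
    )
    try:
        for event in stream:
            for row in event.rows:
                change = normalize_row(operations[type(event)], row)
                change.update(schema=event.schema, table=event.table)
                yield change
    finally:
        stream.close()


sample = normalize_row(
    "UPDATE",
    {"before_values": {"id": 1, "qty": 2}, "after_values": {"id": 1, "qty": 3}},
)
```

Calling `stream_changes({"host": "127.0.0.1", "port": 3306, "user": "...", "passwd": "..."})` would require a MySQL server with row-based binary logging enabled and a user with replication privileges.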

MongoDB - Change Streams based CDC
  • Uses native MongoDB Change Streams

  • Captures INSERT, UPDATE, DELETE operations

  • Requires replica set configuration

  • Real-time event streaming

  • See: core_cdc/processors/mongo/
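For orientation, a Change Stream can be consumed directly with PyMongo's `watch()` as sketched below. This shows the underlying driver usage rather than the core-cdc processor: `normalize_change`, `stream_changes`, and the output dictionary shape are assumptions. The driver import is deferred so that the pure helper works without the optional dependency or a running replica set.

```python
from typing import Any, Dict, Iterator

# Map MongoDB operation types to the DML names used in this document.
OPERATIONS = {
    "insert": "INSERT",
    "update": "UPDATE",
    "replace": "UPDATE",
    "delete": "DELETE",
}


def normalize_change(change: Dict[str, Any]) -> Dict[str, Any]:
    """Reduce a change stream document to the fields a CDC consumer needs."""
    return {
        "op": OPERATIONS[change["operationType"]],
        "database": change["ns"]["db"],
        "collection": change["ns"]["coll"],
        "key": change["documentKey"],
        "document": change.get("fullDocument"),  # absent for deletes
    }


def stream_changes(uri: str, database: str, collection: str) -> Iterator[Dict[str, Any]]:
    """Yield normalized changes for one collection until the stream closes."""
    # Deferred import: only needed when actually connecting to MongoDB.
    from pymongo import MongoClient

    client = MongoClient(uri)
    pipeline = [{"$match": {"operationType": {"$in": list(OPERATIONS)}}}]
    try:
        # updateLookup asks the server to attach the current full document to updates.
        with client[database][collection].watch(
            pipeline, full_document="updateLookup"
        ) as stream:
            for change in stream:
                yield normalize_change(change)
    finally:
        client.close()


sample = normalize_change(
    {
        "operationType": "delete",
        "ns": {"db": "shop", "coll": "orders"},
        "documentKey": {"_id": 7},
    }
)
```

As noted above, Change Streams need a replica set, so `stream_changes` will not work against a standalone `mongod` instance.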

In Development

MS SQL Server - Abstract base class defined
Oracle - Abstract base class defined

Contributing

Contributions are welcome! Please:

  1. Fork the repository

  2. Create a feature branch

  3. Write tests for new functionality

  4. Ensure all tests pass: pytest -n auto

  5. Run linting: pylint core_cdc

  6. Run security checks: bandit -r core_cdc

  7. Submit a pull request

License

This project is licensed under the MIT License. See the LICENSE file for details.

Support

For questions or support, please open an issue on GitLab or contact the maintainers.
