Tool for replication of MySQL databases to ClickHouse
mysql_ch_replicator
mysql_ch_replicator is a powerful and efficient tool designed for real-time replication of MySQL databases to ClickHouse.
With a focus on high performance, it relies heavily on batching and uses a C++ extension for faster execution. The tool supports migrations and schema changes and handles data removal correctly.
Features
- Real-Time Replication: Keeps your ClickHouse database in sync with MySQL in real-time.
- High Performance: Utilizes batching and ports slow parts to C++ (e.g., MySQL internal JSON parsing) for optimal performance.
- Supports Migrations/Schema Changes: Handles adding, altering, and removing tables without breaking the replication process.
- Recovery without Downtime: Allows for preserving old data while performing initial replication, ensuring continuous operation.
- Correct Data Removal: Unlike MaterializedMySQL, mysql_ch_replicator ensures physical removal of data.
- Comprehensive Data Type Support: Accurately replicates most data types, including JSON, booleans, and more. Easily extensible for additional data types.
- Multi-Database Handling: Replicates the binary log once for all databases, optimizing the process compared to MaterializedMySQL, which replicates the log separately for each database.
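The batching mentioned under High Performance can be pictured with a small Python sketch. This illustrates the general technique only and is not taken from the mysql_ch_replicator source; the function name `batched` and the sample events are hypothetical.

```python
from itertools import islice


def batched(records, batch_size):
    """Yield lists of at most batch_size items from any iterable.

    Hypothetical helper: it shows the general batching idea, not the
    replicator's actual implementation.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    iterator = iter(records)
    while True:
        # Pull the next batch_size items; an empty list means the input is exhausted.
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


# Example: 7 change events grouped into batches of 3.
events = [{"id": i} for i in range(7)]
sizes = [len(batch) for batch in batched(events, 3)]
print(sizes)  # [3, 3, 1]
```

Writing one batch at a time replaces many small inserts with a few large ones, which is the usual reason batching helps with a column store such as ClickHouse.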
Installation
To install mysql_ch_replicator, use the following command:
pip install mysql_ch_replicator
You may also need to compile the C++ components if they are not pre-built for your platform.
Usage
Basic Usage
To start the replication process:
- Prepare a config file. Use example_config.yaml as an example.
- Start the replication:
mysql_ch_replicator --config config.yaml run_all
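If the replicator has to be started from a Python script (for example, a deployment or supervisor script), the command above can be wrapped with the standard subprocess module. This is a minimal sketch; the helper names `build_run_all_command` and `run_replicator` are hypothetical, and only the command line itself comes from this document.

```python
import subprocess
from pathlib import Path


def build_run_all_command(config_path):
    """Return the argument list for the run_all command shown above."""
    return ["mysql_ch_replicator", "--config", str(Path(config_path)), "run_all"]


def run_replicator(config_path):
    """Start replication and block until the process exits.

    check=True raises subprocess.CalledProcessError on a non-zero exit code.
    """
    subprocess.run(build_run_all_command(config_path), check=True)
```

Calling run_replicator("config.yaml") assumes the mysql_ch_replicator executable is on PATH, which is normally the case after pip install.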
Configuration
mysql_ch_replicator is configured through a configuration file. Here is an example config:
mysql:
  host: 'localhost'
  port: 8306
  user: 'root'
  password: 'root'
clickhouse:
  host: 'localhost'
  port: 8323
  user: 'default'
  password: 'default'
binlog_replicator:
  data_dir: '/home/user/binlog/'
  records_per_file: 100000
databases: 'database_name_pattern_*'
- mysql: MySQL connection settings
- clickhouse: ClickHouse connection settings
- binlog_replicator.data_dir: Directory for storing the binary log and application state
- databases: Database name pattern to replicate, e.g. db_* will match db_1, db_2, db_test
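The databases pattern can be previewed in Python before starting replication. The sketch below assumes shell-style wildcard matching (as in the db_* example above) and uses the standard fnmatch module; the tool's actual matching rules may differ, and the helper name `matches_database` is hypothetical.

```python
from fnmatch import fnmatchcase


def matches_database(name, pattern):
    """Return True if a database name matches a shell-style pattern.

    Assumption: the replicator treats '*' as a wildcard, as the db_* example
    suggests. This helper is illustrative, not part of the tool's API.
    """
    return fnmatchcase(name, pattern)


pattern = "db_*"
candidates = ["db_1", "db_2", "db_test", "analytics"]
selected = [name for name in candidates if matches_database(name, pattern)]
print(selected)  # ['db_1', 'db_2', 'db_test']
```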
Advanced Features
Migrations & Schema Changes
mysql_ch_replicator supports the following:
- Adding Tables: Automatically starts replicating data from newly added tables.
- Altering Tables: Adjusts replication strategy based on schema changes.
- Removing Tables: Handles removal of tables without disrupting the replication process.
Recovery Without Downtime
In case of a failure or during the initial replication, mysql_ch_replicator will preserve old data and continue syncing new data seamlessly. You can also remove the state and restart replication from scratch.
Development
To contribute to mysql_ch_replicator, clone the repository and install the required dependencies:
git clone https://github.com/your-repo/mysql_ch_replicator.git
cd mysql_ch_replicator
pip install -r requirements.txt
Running Tests
To run the tests you will need:
- Running MySQL and ClickHouse servers
- A config.yaml that will be used during tests
- Run tests with:
pytest -v -s test_mysql_ch_replicator.py
Contribution
Contributions are welcome! Please open an issue or submit a pull request for any bugs or features you would like to add.
License
mysql_ch_replicator is licensed under the MIT License. See the LICENSE file for more details.
Acknowledgements
Thank you to all the contributors who have helped build and improve this tool.