Python Powered ETL
Extral
Extral is a versatile ETL (Extract, Transform, Load) application designed to move data from a source database to a destination database.
Supported sources:
- MySQL / MariaDB
- PostgreSQL
Supported destinations:
- MySQL / MariaDB
- PostgreSQL
License
This project is licensed under the Apache License 2.0 - see the LICENSE file for details.
Configuration
Extral uses YAML configuration files to define the ETL process. Below is a sample configuration format.
Specify the configuration file with --config <file.yaml> when running the application.
Config file format
logging:
  - level: info
source:
  - type: mysql
    host: localhost
    port: 3306
    user: root
    password: example_password
    database: example_db
    charset: utf8mb4
tables:
  - name: customers
    batch_size: 100
    strategy: merge
    merge_key: id
    incremental:
      field: updated_on
      type: datetime
      initial_value: '2022-01-01T00:00:00'
  - name: orders
    strategy: append
  - name: order_types
    strategy: replace
destination:
  - type: postgresql
    host: localhost
    port: 5432
    user: loader
    password: example_password
    database: example_db
    schema: public
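To catch mistakes in the tables section before a run, a small check like the one below may help. This is a minimal sketch, not part of Extral: the function check_table and its rules are assumptions inferred from the sample above (the strategy names come from this README; requiring merge_key for merge and field for incremental is a guess at what Extral expects).

```python
# Hypothetical helper, not Extral's API: sanity-check one `tables` entry.
VALID_STRATEGIES = {"merge", "append", "replace"}  # names taken from this README


def check_table(table: dict) -> list[str]:
    """Return a list of human-readable problems found in one table entry."""
    problems = []
    name = table.get("name")
    if not name:
        problems.append("missing name")
    strategy = table.get("strategy")
    # Assumption: strategy may be omitted, as in the incremental sample.
    if strategy is not None and strategy not in VALID_STRATEGIES:
        problems.append(f"{name}: unknown strategy {strategy!r}")
    # Assumption: the merge strategy needs a merge_key to match rows.
    if strategy == "merge" and "merge_key" not in table:
        problems.append(f"{name}: merge requires merge_key")
    incremental = table.get("incremental")
    # Assumption: an incremental block needs at least the cursor field.
    if incremental is not None and "field" not in incremental:
        problems.append(f"{name}: incremental is missing field")
    return problems


# The same entries as the sample configuration, written as Python dicts.
tables = [
    {
        "name": "customers",
        "batch_size": 100,
        "strategy": "merge",
        "merge_key": "id",
        "incremental": {
            "field": "updated_on",
            "type": "datetime",
            "initial_value": "2022-01-01T00:00:00",
        },
    },
    {"name": "orders", "strategy": "append"},
    {"name": "order_types", "strategy": "replace"},
]

for entry in tables:
    print(entry["name"], check_table(entry))
```

An empty list means the entry passed these checks. It does not guarantee that Extral itself will accept the file.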
Incremental data loading
Extral supports incremental data loading, which uses a cursor to track the data that has already been extracted.
Sample configuration:
tables:
  - name: customers
    batch_size: 100
    incremental:
      field: updated_on
      type: datetime
      initial_value: '2022-01-01T00:00:00'
In the example above, the customers table uses a cursor based on the 'updated_on' field, with the cursor type and an initial_value specified. The first extraction includes only records whose 'updated_on' value is later than January 1st, 2022. Subsequent extractions start from the last 'updated_on' value seen, so only new records are extracted.
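The cursor idea can be illustrated with Python's built-in sqlite3 module. This is a sketch of the general technique under stated assumptions, not Extral's implementation: the function extract_new_rows, the in-memory table, and the sample rows are all hypothetical, and timestamps are stored as ISO 8601 strings so that string comparison matches chronological order.

```python
import sqlite3


def extract_new_rows(conn, cursor_value, batch_size=100):
    """Return (rows, new_cursor) for customers changed after cursor_value.

    Rows are read in batches ordered by updated_on. After each batch the
    cursor advances to the last value seen. Caveat of this simple sketch:
    rows sharing the exact timestamp at a batch boundary would be skipped.
    """
    collected = []
    while True:
        batch = conn.execute(
            "SELECT id, name, updated_on FROM customers "
            "WHERE updated_on > ? ORDER BY updated_on LIMIT ?",
            (cursor_value, batch_size),
        ).fetchall()
        if not batch:
            return collected, cursor_value
        collected.extend(batch)
        cursor_value = batch[-1][2]  # last value seen becomes the new cursor


# Hypothetical source data for the demonstration.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, updated_on TEXT)"
)
conn.executemany(
    "INSERT INTO customers VALUES (?, ?, ?)",
    [
        (1, "Ada", "2021-12-31T09:00:00"),    # older than initial_value
        (2, "Grace", "2022-03-01T10:00:00"),
        (3, "Linus", "2022-06-15T08:30:00"),
    ],
)

# First run starts from the configured initial_value.
first_rows, cursor = extract_new_rows(conn, "2022-01-01T00:00:00")

# A new record appears in the source; the second run resumes from the cursor.
conn.execute("INSERT INTO customers VALUES (4, 'Guido', '2022-07-01T12:00:00')")
second_rows, cursor = extract_new_rows(conn, cursor)

print([row[0] for row in first_rows], [row[0] for row in second_rows], cursor)
```

In this sketch the first run skips the record from 2021, and the second run returns only the record added after the first run finished.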
Data loading strategies
Extral supports three strategies for loading data into the destination database:
Merge
The merge strategy updates existing records in the destination table based on a specified merge_key and inserts new records that do not already exist. This strategy is ideal for maintaining up-to-date data while avoiding duplication.
Sample configuration:
tables:
  - name: customers
    strategy: merge
    merge_key: id
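The effect of a merge can be shown on plain Python data. The following is an illustrative sketch only: merge_rows is a hypothetical function operating on in-memory lists, whereas Extral performs the merge in the destination database in a way this README does not describe.

```python
def merge_rows(destination: list[dict], incoming: list[dict], merge_key: str) -> list[dict]:
    """Update rows that share merge_key with an incoming row; insert the rest."""
    # Index the existing rows by their key so lookups are O(1).
    by_key = {row[merge_key]: row for row in destination}
    for row in incoming:
        existing = by_key.get(row[merge_key], {})
        # Incoming values win over existing ones; new keys are simply added.
        by_key[row[merge_key]] = {**existing, **row}
    return list(by_key.values())


# Hypothetical data: id 2 already exists, id 3 is new.
destination = [{"id": 1, "name": "Ada"}, {"id": 2, "name": "Grace"}]
incoming = [{"id": 2, "name": "Grace Hopper"}, {"id": 3, "name": "Linus"}]

merged = merge_rows(destination, incoming, merge_key="id")
print(merged)
```

The record with id 2 is updated in place and id 3 is added, so no key appears twice in the result.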
Replace
The replace strategy truncates the destination table before loading new data. This ensures that the destination table contains only the latest data extracted from the source. Use this strategy when you want to completely overwrite the existing data.
Sample configuration:
tables:
  - name: order_types
    strategy: replace
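A replace load can be sketched with Python's built-in sqlite3 module. The function replace_rows and the table layout are hypothetical. SQLite has no TRUNCATE statement, so an unconditional DELETE stands in for the truncation described above. Extral's own SQL for MySQL or PostgreSQL destinations may differ.

```python
import sqlite3


def replace_rows(conn, rows):
    """Empty the destination table, then load the freshly extracted rows."""
    # `with conn` commits on success and rolls back if an error is raised,
    # so the table is never left empty by a failed load.
    with conn:
        conn.execute("DELETE FROM order_types")
        conn.executemany(
            "INSERT INTO order_types (id, label) VALUES (?, ?)", rows
        )


# Hypothetical destination table holding one stale row.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE order_types (id INTEGER PRIMARY KEY, label TEXT)")
conn.execute("INSERT INTO order_types VALUES (1, 'legacy')")

replace_rows(conn, [(10, "retail"), (11, "wholesale")])
remaining = conn.execute("SELECT id, label FROM order_types ORDER BY id").fetchall()
print(remaining)
```

After the call, the stale row is gone and only the newly loaded rows remain.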
Append
The append strategy adds new records to the destination table without modifying or removing existing records. This is useful for scenarios where historical data needs to be preserved and new data is simply added to the table.
Sample configuration:
tables:
  - name: orders
    strategy: append
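For comparison, an append load only ever inserts. The sketch below uses Python's built-in sqlite3 module with a hypothetical append_rows function and table layout. It illustrates the behaviour described above and is not Extral's code.

```python
import sqlite3


def append_rows(conn, rows):
    """Insert the extracted rows; existing rows are never touched."""
    with conn:  # commit on success, roll back on error
        conn.executemany(
            "INSERT INTO orders (customer_id, total) VALUES (?, ?)", rows
        )


# Hypothetical destination table with its own surrogate key.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders ("
    "row_id INTEGER PRIMARY KEY AUTOINCREMENT, customer_id INTEGER, total REAL)"
)

append_rows(conn, [(1, 19.99)])
# A second load adds rows even when one of them repeats earlier data.
append_rows(conn, [(2, 5.00), (1, 19.99)])

count = conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
print(count)
```

In this sketch, loading the same source row twice stores it twice. Pairing an append-style load with a cursor that selects only new rows is one way to avoid that.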
Download files
File details
Details for the file extral-0.1.0.tar.gz.
File metadata
- Download URL: extral-0.1.0.tar.gz
- Upload date:
- Size: 31.3 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.12.9
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | f7c88ae83ec3c8f10b45a86a9a51aae53e842f59fa48b59d909c07a03538d070 |
| MD5 | 5a8632af1d82a7194914c81b9470429d |
| BLAKE2b-256 | 9108fe7b3c8c16ca6bfe828b49f2382b24d09e5b01964fb4be86c4713b6aa8fa |
File details
Details for the file extral-0.1.0-py3-none-any.whl.
File metadata
- Download URL: extral-0.1.0-py3-none-any.whl
- Upload date:
- Size: 27.4 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.12.9
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 54efc71c28dbe516f7686d2c3299a97da32353e0817119808bf8afaba7076905 |
| MD5 | 1c360a81ea6fa7968cf16db1ccf93f52 |
| BLAKE2b-256 | f016ec7078dc01506abb03add0352fb0a844dfddb48ed130ea2a3ad31ae42fa8 |