Python Powered ETL

Extral

Extral is an ETL (Extract, Transform, Load) application that moves data from a source database to a destination database.

Supported sources:

  • MySQL / MariaDB
  • PostgreSQL

Supported destinations:

  • MySQL / MariaDB
  • PostgreSQL

License

This project is licensed under the Apache License 2.0 - see the LICENSE file for details.

Configuration

Extral uses a YAML configuration file to define the ETL process: the source connection, the tables to transfer, and the destination connection.

Pass the file with --config <file.yaml> when running the application.

Config file format

logging:
  - level: info

source:
  - type: mysql
    host: localhost
    port: 3306
    user: root
    password: example_password
    database: example_db
    charset: utf8mb4

tables:
  - name: customers
    batch_size: 100
    strategy: merge
    merge_key: id
    incremental:
      field: updated_on
      type: datetime
      initial_value: '2022-01-01T00:00:00'
  - name: orders
    strategy: append
  - name: order_types
    strategy: replace

destination:
  - type: postgresql
    host: localhost
    port: 5432
    user: loader
    password: example_password
    database: example_db
    schema: public
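
The tables section of a configuration like the one above can be sanity-checked before a run. The following is a minimal sketch, not part of Extral: check_tables is a hypothetical helper, and it assumes the YAML has already been parsed into plain Python dictionaries. It checks only what this README states: the three strategy names, and that merge works from a merge_key. Whether Extral itself rejects such configurations, and what it does when strategy is omitted, is not documented here.

```python
# Hypothetical helper for illustration; not an Extral API.
VALID_STRATEGIES = {"merge", "append", "replace"}


def check_tables(tables):
    """Return a list of human-readable problems found in the table entries."""
    problems = []
    for table in tables:
        name = table.get("name", "<unnamed>")
        strategy = table.get("strategy")
        # A missing strategy is accepted here because the default is not documented.
        if strategy is not None and strategy not in VALID_STRATEGIES:
            problems.append(f"{name}: unknown strategy {strategy!r}")
        if strategy == "merge" and "merge_key" not in table:
            problems.append(f"{name}: merge strategy needs a merge_key")
    return problems


# The tables section of the sample configuration, as parsed data.
tables = [
    {
        "name": "customers",
        "batch_size": 100,
        "strategy": "merge",
        "merge_key": "id",
        "incremental": {
            "field": "updated_on",
            "type": "datetime",
            "initial_value": "2022-01-01T00:00:00",
        },
    },
    {"name": "orders", "strategy": "append"},
    {"name": "order_types", "strategy": "replace"},
]

sample_problems = check_tables(tables)
broken_problems = check_tables([{"name": "customers", "strategy": "merge"}])
print(sample_problems)
print(broken_problems)
```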

Incremental data loading

Extral supports incremental data loading, which uses a cursor to track the data that has already been extracted.

Sample configuration:

tables:
  - name: customers
    batch_size: 100
    incremental:
      field: updated_on
      type: datetime
      initial_value: '2022-01-01T00:00:00'

In the example above, the customers table uses a cursor on the updated_on field, with the cursor's type and initial_value set explicitly. The first extraction includes only records whose updated_on value is later than 2022-01-01T00:00:00. Each subsequent extraction starts from the last updated_on value seen and picks up only newer records.
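
The cursor mechanism can be illustrated with a small, self-contained sketch. This is not Extral's implementation: it uses an in-memory SQLite database as a stand-in for the source, and extract_incremental is a hypothetical function name. It assumes timestamps are stored as ISO 8601 strings in a single format, so that string comparison matches chronological order.

```python
import sqlite3


def extract_incremental(conn, table, field, cursor_value, batch_size=100):
    """Yield batches of rows whose `field` is later than `cursor_value`."""
    # Table and field names come from trusted configuration in this sketch;
    # the cursor value is passed as a bound parameter.
    query = f"SELECT * FROM {table} WHERE {field} > ? ORDER BY {field}"
    result = conn.execute(query, (cursor_value,))
    while True:
        batch = result.fetchmany(batch_size)
        if not batch:
            break
        yield batch


# In-memory stand-in for the source database.
source = sqlite3.connect(":memory:")
source.row_factory = sqlite3.Row  # allows row["updated_on"] lookups
source.execute(
    "CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, updated_on TEXT)"
)
source.executemany(
    "INSERT INTO customers VALUES (?, ?, ?)",
    [
        (1, "Ada", "2021-12-31T23:59:59"),
        (2, "Grace", "2022-03-01T10:00:00"),
        (3, "Linus", "2022-06-15T08:30:00"),
    ],
)

cursor_value = "2022-01-01T00:00:00"  # initial_value from the configuration
extracted = []
for batch in extract_incremental(source, "customers", "updated_on", cursor_value):
    extracted.extend(batch)
    # Rows are ordered by the cursor field, so the last row holds the newest value.
    cursor_value = batch[-1]["updated_on"]

extracted_ids = [row["id"] for row in extracted]
print(extracted_ids)  # [2, 3]
print(cursor_value)   # 2022-06-15T08:30:00
```

Running the extraction again with the stored cursor value returns nothing until a row with a later updated_on appears in the source.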

Data loading strategies

Extral supports three strategies for loading data into the destination database:

Merge

The merge strategy updates existing records in the destination table based on a specified merge_key and inserts new records that do not already exist. This strategy is ideal for maintaining up-to-date data while avoiding duplication.

Sample configuration:

tables:
  - name: customers
    strategy: merge
    merge_key: id
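
As an illustration of merge semantics, the sketch below updates rows that match on the key and inserts the rest. It is a simplified stand-in, not Extral's code: it uses an in-memory SQLite table, a hypothetical merge_rows function, and an update-then-insert loop. PostgreSQL and MySQL offer their own native upsert statements, and which mechanism Extral uses is not stated in this README.

```python
import sqlite3

# In-memory stand-in for the destination table, with two existing rows.
dest = sqlite3.connect(":memory:")
dest.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
dest.executemany("INSERT INTO customers VALUES (?, ?)", [(1, "Ada"), (2, "Grace")])


def merge_rows(conn, rows):
    """Update rows whose id (the merge key) already exists; insert the others."""
    for row_id, name in rows:
        updated = conn.execute(
            "UPDATE customers SET name = ? WHERE id = ?", (name, row_id)
        )
        if updated.rowcount == 0:  # no existing row matched the merge key
            conn.execute(
                "INSERT INTO customers (id, name) VALUES (?, ?)", (row_id, name)
            )
    conn.commit()


# id 2 already exists and is updated; id 3 is new and is inserted.
merge_rows(dest, [(2, "Grace Hopper"), (3, "Linus")])
rows_after_merge = dest.execute(
    "SELECT id, name FROM customers ORDER BY id"
).fetchall()
print(rows_after_merge)  # [(1, 'Ada'), (2, 'Grace Hopper'), (3, 'Linus')]
```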

Replace

The replace strategy truncates the destination table before loading new data. This ensures that the destination table contains only the latest data extracted from the source. Use this strategy when you want to completely overwrite the existing data.

Sample configuration:

tables:
  - name: order_types
    strategy: replace
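
The effect of replace can be sketched as follows. This is an illustration under assumptions, not Extral's code: it uses an in-memory SQLite table and a hypothetical replace_rows function. SQLite has no TRUNCATE statement, so the sketch empties the table with DELETE instead. Wrapping both steps in one transaction is a choice made for this example; how Extral handles transactions is not documented here.

```python
import sqlite3

# In-memory stand-in for the destination table, holding stale data.
dest = sqlite3.connect(":memory:")
dest.execute("CREATE TABLE order_types (id INTEGER PRIMARY KEY, label TEXT)")
dest.executemany(
    "INSERT INTO order_types VALUES (?, ?)", [(1, "retail"), (2, "legacy")]
)


def replace_rows(conn, rows):
    """Empty the destination table, then load the freshly extracted rows."""
    with conn:  # commits on success, rolls back if an exception is raised
        # SQLite has no TRUNCATE; DELETE without WHERE removes every row.
        conn.execute("DELETE FROM order_types")
        conn.executemany("INSERT INTO order_types VALUES (?, ?)", rows)


replace_rows(dest, [(1, "retail"), (3, "wholesale")])
rows_after_replace = dest.execute(
    "SELECT id, label FROM order_types ORDER BY id"
).fetchall()
print(rows_after_replace)  # [(1, 'retail'), (3, 'wholesale')]
```

The row with id 2 is gone afterwards because it no longer exists in the extracted data.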

Append

The append strategy adds new records to the destination table without modifying or removing existing records. This is useful for scenarios where historical data needs to be preserved and new data is simply added to the table.

Sample configuration:

tables:
  - name: orders
    strategy: append
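
A short sketch of append semantics follows. It is illustrative only: it uses an in-memory SQLite table without a unique constraint and a hypothetical append_rows function. It shows one consequence of plain appends: loading the same batch twice stores it twice.

```python
import sqlite3

# In-memory stand-in for the destination table, with one existing row.
dest = sqlite3.connect(":memory:")
dest.execute("CREATE TABLE orders (order_id INTEGER, amount REAL)")
dest.executemany("INSERT INTO orders VALUES (?, ?)", [(100, 25.0)])


def append_rows(conn, rows):
    """Insert rows without touching anything already in the table."""
    with conn:  # commits on success, rolls back if an exception is raised
        conn.executemany("INSERT INTO orders VALUES (?, ?)", rows)


append_rows(dest, [(101, 40.0)])
append_rows(dest, [(101, 40.0)])  # the same batch again: the row is duplicated
order_ids = [
    row[0] for row in dest.execute("SELECT order_id FROM orders ORDER BY order_id")
]
print(order_ids)  # [100, 101, 101]
```

In a setup like this, combining append with an incremental cursor is one way to keep a re-run from extracting the same rows again.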

