Python Powered ETL

Extral

Extral is a versatile ETL (Extract, Transform, Load) application designed to move data from a source database to a destination database.

Supported Connectors:

  • MySQL / MariaDB (source and destination)
  • PostgreSQL (source and destination)

License

This project is licensed under the Apache License 2.0 - see the LICENSE file for details.

Configuration

Extral uses YAML configuration files to define the ETL process. Below is a sample configuration format.

Specify the configuration file with --config <file.yaml> when running the application.

Config file format

logging:
  - level: info

source:
  - type: mysql
    host: localhost
    port: 3306
    user: root
    password: example_password
    database: example_db
    charset: utf8mb4

tables:
  - name: customers
    batch_size: 100
    strategy: merge
    merge_key: id
    incremental:
      field: updated_on
      type: datetime
      initial_value: '2022-01-01T00:00:00'
  - name: orders
    strategy: append
  - name: order_types
    strategy: replace

destination:
  - type: postgresql
    host: localhost
    port: 5432
    user: loader
    password: example_password
    database: example_db
    schema: public
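
The tables section above can be sanity-checked before a run. The following is a minimal sketch, not part of Extral: the function check_tables and its rules are assumptions made for illustration (that strategy is optional, that only the three documented strategies are valid, and that merge needs a merge_key). It works on the tables list as it would look after the YAML file has been parsed into Python dictionaries.

```python
# Hypothetical helper, not part of Extral: checks a parsed `tables` list.
VALID_STRATEGIES = {"merge", "replace", "append"}


def check_tables(tables):
    """Return a list of human-readable problems found in the table entries."""
    problems = []
    for table in tables:
        name = table.get("name", "<unnamed>")
        strategy = table.get("strategy")
        # Assumption: a missing strategy is allowed and left to the tool's default.
        if strategy is not None and strategy not in VALID_STRATEGIES:
            problems.append(f"{name}: unknown strategy {strategy!r}")
        # Assumption: the merge strategy cannot work without a merge_key.
        if strategy == "merge" and "merge_key" not in table:
            problems.append(f"{name}: merge strategy needs a merge_key")
    return problems


tables = [
    {"name": "customers", "strategy": "merge", "merge_key": "id"},
    {"name": "orders", "strategy": "append"},
    {"name": "order_types", "strategy": "replace"},
]
problems = check_tables(tables)
```

For the sample configuration, problems is an empty list.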

Incremental data loading

Extral supports incremental data loading, which uses a cursor to track the data that has already been extracted.

Sample configuration:

tables:
  - name: customers
    batch_size: 100
    incremental:
      field: updated_on
      type: datetime
      initial_value: '2022-01-01T00:00:00'

In the example above, the customers table uses a cursor on the updated_on field, with the cursor's type and initial_value specified. The first extraction includes only records whose updated_on value is later than January 1st, 2022. Each subsequent extraction starts from the last updated_on value seen and extracts only newer records.
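
The cursor behaviour described here can be sketched in plain Python. This is an illustration under stated assumptions, not Extral's implementation: the function select_incremental is hypothetical, rows are modelled as dictionaries held in memory, and "later than" is read as a strict comparison.

```python
from datetime import datetime


def select_incremental(rows, field, cursor):
    """Return the rows whose `field` is strictly later than `cursor`,
    together with the cursor value to use for the next extraction."""
    fresh = [row for row in rows if row[field] > cursor]
    # Keep the old cursor when nothing new was found.
    new_cursor = max((row[field] for row in fresh), default=cursor)
    return fresh, new_cursor


rows = [
    {"id": 1, "updated_on": datetime(2021, 12, 31)},
    {"id": 2, "updated_on": datetime(2022, 3, 1)},
    {"id": 3, "updated_on": datetime(2022, 6, 15)},
]

# First extraction starts from the configured initial_value.
cursor = datetime.fromisoformat("2022-01-01T00:00:00")
first, cursor = select_incremental(rows, "updated_on", cursor)

# A new record arrives; the second extraction only picks up that record.
rows.append({"id": 4, "updated_on": datetime(2022, 7, 1)})
second, cursor = select_incremental(rows, "updated_on", cursor)
```

Here first holds the records with id 2 and 3, and second holds only the record with id 4. A real extraction would push the comparison into the source query rather than filter in memory.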

Data loading strategies

Extral supports three strategies for loading data into the destination database:

Merge

The merge strategy updates existing records in the destination table based on a specified merge_key and inserts new records that do not already exist. This strategy is ideal for maintaining up-to-date data while avoiding duplication.

Sample configuration:

tables:
  - name: customers
    strategy: merge
    merge_key: id
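
The effect of a merge can be sketched on in-memory rows. The function merge_rows below is hypothetical and only illustrates the semantics described above. It assumes that a matching incoming record replaces the whole existing record, which the description does not spell out.

```python
def merge_rows(destination, incoming, merge_key):
    """Update rows that share a merge_key value and insert the rest."""
    # Index existing rows by their key; dicts keep insertion order.
    by_key = {row[merge_key]: dict(row) for row in destination}
    for row in incoming:
        # Assumption: a matching incoming row replaces the existing row entirely.
        by_key[row[merge_key]] = dict(row)
    return list(by_key.values())


destination = [{"id": 1, "name": "Ada"}, {"id": 2, "name": "Bob"}]
incoming = [{"id": 2, "name": "Robert"}, {"id": 3, "name": "Cy"}]
merged = merge_rows(destination, incoming, "id")
```

After the merge, record 1 is untouched, record 2 carries the new name, and record 3 has been inserted, so no merge_key value appears twice.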

Replace

The replace strategy truncates the destination table before loading new data. This ensures that the destination table contains only the latest data extracted from the source. Use this strategy when you want to completely overwrite the existing data.

Sample configuration:

tables:
  - name: order_types
    strategy: replace

Append

The append strategy adds new records to the destination table without modifying or removing existing records. Use it when historical data must be preserved and new data is simply added to the table.

Sample configuration:

tables:
  - name: orders
    strategy: append
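
The difference between replace and append shows up when the same batches are loaded twice. The sketch below uses an in-memory SQLite database from the Python standard library purely as a stand-in destination. The load function and the two-column table layout are assumptions for illustration, and SQLite's DELETE without a WHERE clause stands in for truncation.

```python
import sqlite3


def load(conn, table, rows, strategy):
    """Load (id, name) rows into `table` using 'replace' or 'append'."""
    if strategy == "replace":
        # SQLite has no TRUNCATE; DELETE without WHERE empties the table.
        conn.execute(f"DELETE FROM {table}")
    elif strategy != "append":
        raise ValueError(f"unsupported strategy: {strategy}")
    conn.executemany(f"INSERT INTO {table} (id, name) VALUES (?, ?)", rows)
    conn.commit()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE order_types (id INTEGER, name TEXT)")
conn.execute("CREATE TABLE orders (id INTEGER, name TEXT)")

# Load two batches in sequence into each table.
for batch in ([(1, "retail")], [(2, "wholesale")]):
    load(conn, "order_types", batch, "replace")
    load(conn, "orders", batch, "append")

replaced = conn.execute("SELECT id, name FROM order_types ORDER BY id").fetchall()
appended = conn.execute("SELECT id, name FROM orders ORDER BY id").fetchall()
```

After both batches, order_types holds only the second batch, while orders holds the rows from both.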
