BeETL is a Python package for extracting data from one datasource, transforming it and loading it into another datasource.

BeETL: Extensible Python/Polars-based ETL Framework


BeETL grew out of our work as integration developers, where most of the integrations we build follow the same pattern: get data here, transform it a little, put it there (with the middle step frequently missing altogether).

After building our 16th integration between the same two systems from yet another manual template, we decided to build BeETL. Each sync is currently limited to one source datasource and one destination datasource, but this will be expanded in the future. One configuration can contain multiple syncs.

Note: Even though some of the configuration below is in YAML format, you can also use JSON or a Python dictionary.
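
Because a Python dictionary is accepted, a configuration kept as JSON text can be parsed with the standard library first. The sketch below only shows that parsing step; the `CONFIG_JSON` text and the `load_config_dict` helper are hypothetical names for illustration and are not part of BeETL.

```python
import json

# Hypothetical configuration text; the top-level keys mirror the minimal example.
CONFIG_JSON = """
{
    "version": "V1",
    "sources": [],
    "sync": []
}
"""


def load_config_dict(text: str) -> dict:
    """Parse JSON configuration text into a plain Python dictionary."""
    config = json.loads(text)
    if not isinstance(config, dict):
        raise ValueError("The configuration root must be a JSON object")
    return config


config_dict = load_config_dict(CONFIG_JSON)
print(config_dict["version"])  # prints V1
```

The resulting dictionary has the same shape as the one passed to `BeetlConfig` in the minimal example.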

Minimal example

# Syncing users from one table to another in the same database
from src.beetl.beetl import Beetl, BeetlConfig

config = BeetlConfig({
    "version": "V1",
    # Datasources that the syncs below refer to by name
    "sources": [
        {
            "name": "Sqlserver",
            "type": "Sqlserver",
            "connection": {
                "settings": {
                    "connection_string": "Server=myServerAddress;Database=myDataBase;User Id=myUsername;Password=myPassword;"
                }
            }
        }
    ],
    "sync": [
        {
            "name": "Sync between two tables in a sql server",
            "source": "Sqlserver",
            "sourceConfig": {
                "query": "SELECT id, name, email FROM users"
            },
            # Must match the "name" of an entry in "sources"
            "destination": "Sqlserver",
            "destinationConfig": {
                "table": "users",
                "unique_columns": ["id"]
            },
            # Columns compared between source and destination
            "comparisonColumns": [
                {
                    "name": "id",
                    "type": "Int32",
                    "unique": True
                },
                {
                    "name": "name",
                    "type": "Utf8"
                },
                {
                    "name": "email",
                    "type": "Utf8"
                }
            ]
        }
    ]
})

Beetl(config).sync()
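
Each sync refers to its datasources by name, so a misspelled or differently cased name is an easy mistake to make. The following is a minimal sketch of a pre-flight check on the plain dictionary, assuming names are matched exactly (case-sensitively); `find_unknown_datasources` is a hypothetical helper and not part of BeETL.

```python
def find_unknown_datasources(config: dict) -> list:
    """Return one message per sync reference to a name missing from 'sources'."""
    defined = {source["name"] for source in config.get("sources", [])}
    problems = []
    for sync in config.get("sync", []):
        for key in ("source", "destination"):
            name = sync.get(key)
            if name not in defined:
                sync_name = sync.get("name", "<unnamed>")
                problems.append(
                    f"{sync_name}: {key} {name!r} is not defined in 'sources'"
                )
    return problems


# Hypothetical configuration with a casing mismatch in "destination".
example = {
    "sources": [{"name": "Sqlserver"}],
    "sync": [
        {"name": "users", "source": "Sqlserver", "destination": "SqlServer"}
    ],
}

for problem in find_unknown_datasources(example):
    print(problem)
```

Running a check like this before constructing the configuration object surfaces naming mistakes without needing a database connection.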

Installation

From PyPI

#!/bin/bash
python -m pip install beetl

# If you need to use xsl transformations
# (quoted so that shells such as zsh do not expand the brackets)
python -m pip install "beetl[xsl]"

From Source

#!/bin/bash
# Clone and enter the repository
git clone https://github.com/Hoglandets-IT/beetl.git
cd ./beetl
# Install the build tools
python -m pip install build
# Build beetl
python -m build
# Install beetl from locally built package
python -m pip install ./dist/*.tar.gz
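
After either installation route, one way to confirm that the package is visible to the current interpreter is to query its installed metadata with the standard library. This is a small sketch; `installed_version` is a hypothetical helper, and the distribution name `beetl` is taken from the `pip install` commands above.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(distribution: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return version(distribution)
    except PackageNotFoundError:
        return None


# Prints the version string if beetl is installed, otherwise None.
print(installed_version("beetl"))
```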

Getting Started

The latest information about how to use beetl is available in the official docs.

Development Environment

The easiest way to get started is to use the included devcontainer.

Requirements

  • Docker
  • Visual Studio Code

Steps

  1. Clone the repository.
  2. Open the repository in Visual Studio Code.
  3. Install the recommended extensions.
  4. Open the command palette (Ctrl+Shift+P), search for "Reopen in Container" and run it.
    • The devcontainer will now be provisioned in your local Docker instance and Visual Studio Code will automatically connect to it.
  5. You can now use the included launch profiles to either open the docs or run the tests file.
  6. You can also use the built-in test explorer to run the available tests.
