
application framework for ETL(ELT) processing




Introduction

What is cliboa

cliboa is an application framework for implementing ETL(ELT) pipelines. Here, an ETL(ELT) pipeline means a sequence of processing steps, such as fetching, transforming, and transferring data, between various databases, storage systems, and other services.

Features

  • Python-based framework.
  • ETL(ELT) processing is driven by YAML-based configuration.
  • If the default modules are not enough, additional modules for an ETL(ELT) pipeline can be implemented in only a few steps.

Manual

See MANUAL.md

How to Contribute

See CONTRIBUTING.md

Quick Start

Requirements

Available on macOS and Linux distributions such as Debian, Ubuntu, CentOS, and RHEL.

Install cliboa

Python 3.7 or later and poetry are required. In an environment where pip is available, run the following commands.

sudo pip3 install poetry
sudo pip3 install cliboa
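
The requirements above can be checked from Python before going further. The following is a minimal sketch, not part of cliboa itself: the function name `check_prerequisites` is hypothetical, and it assumes the installed package is importable under the name `cliboa`.

```python
import importlib.util
import shutil
import sys


def check_prerequisites(min_version=(3, 7)):
    """Return a dict describing whether each stated requirement is met."""
    return {
        # The README asks for Python 3.7 or later.
        "python_ok": sys.version_info >= min_version,
        # True when a `poetry` executable is found on PATH.
        "poetry_found": shutil.which("poetry") is not None,
        # Assumption: the package is importable under the name `cliboa`.
        "cliboa_found": importlib.util.find_spec("cliboa") is not None,
    }


if __name__ == "__main__":
    for name, ok in check_prerequisites().items():
        print(f"{name}: {'yes' if ok else 'no'}")
```

If any entry reports "no", rerun the corresponding install command before continuing.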

Configuration of a Simple ETL Processing

After installing cliboa, the 'cliboadmin' command is available as an administration command.

Use cliboadmin to create an environment in which cliboa can run.

$ cd /usr/local
$ sudo cliboadmin init sample
$ cd sample
$ sudo cliboadmin create simple-etl

Directory Tree

The commands above create the following directory tree.

sample
├── pyproject.toml
├── bin
│   └── clibomanager.py
├── cliboa
│   └── conf
├── common
│   ├── __init__.py
│   ├── environment.py
│   └── scenario
├── conf
│   ├── cliboa.ini
│   └── logging.conf
├── logs
├── project
│   └── simple-etl
│       ├── scenario
│       └── scenario.yml
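
To confirm that the layout was generated as shown, a short script can compare the tree against the file system. This is a sketch only: `EXPECTED_PATHS` and `missing_paths` are hypothetical names, the paths are copied from the tree above, and the root `/usr/local/sample` assumes the commands were run exactly as listed.

```python
from pathlib import Path

# Paths taken from the directory tree above, relative to the project root.
EXPECTED_PATHS = [
    "pyproject.toml",
    "bin/clibomanager.py",
    "cliboa/conf",
    "common/__init__.py",
    "common/environment.py",
    "common/scenario",
    "conf/cliboa.ini",
    "conf/logging.conf",
    "logs",
    "project/simple-etl/scenario",
    "project/simple-etl/scenario.yml",
]


def missing_paths(root):
    """Return the expected paths that do not exist under `root`."""
    root = Path(root)
    return [p for p in EXPECTED_PATHS if not (root / p).exists()]


if __name__ == "__main__":
    # Assumption: the project was created under /usr/local as shown above.
    missing = missing_paths("/usr/local/sample")
    print("layout ok" if not missing else f"missing: {missing}")
```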

Install PyPI packages

$ cd sample
$ poetry install

Write a Scenario of ETL Processing

As a simple ETL process, write scenario.yml in simple-etl as below.

The following example simply downloads a gzip file from the local SFTP server, decompresses it, and uploads the result to the local SFTP server.

See Examples
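
To make the three steps of that scenario concrete, here is a plain-Python illustration of the same flow. It is not cliboa's API and not a scenario.yml: the function names are hypothetical, and a local directory stands in for the SFTP server so that the sketch runs with the standard library alone.

```python
import gzip
import shutil
from pathlib import Path


def download(remote_dir, name, work_dir):
    """Stand-in for an SFTP download: copy the file into the work directory."""
    return Path(shutil.copy(Path(remote_dir) / name, Path(work_dir) / name))


def decompress(gz_path):
    """Gunzip `gz_path` next to itself and return the decompressed path."""
    out_path = gz_path.with_suffix("")  # test.csv.gz -> test.csv
    with gzip.open(gz_path, "rb") as src, open(out_path, "wb") as dst:
        shutil.copyfileobj(src, dst)
    return out_path


def upload(local_path, remote_dir):
    """Stand-in for an SFTP upload: copy the file back to the remote side."""
    return Path(shutil.copy(local_path, Path(remote_dir) / local_path.name))


def run_pipeline(remote_dir, work_dir, name="test.csv.gz"):
    """Run the three steps in order and return the uploaded file's path."""
    return upload(decompress(download(remote_dir, name, work_dir)), remote_dir)
```

In cliboa the equivalent steps are declared in scenario.yml rather than written by hand; the linked examples show the actual configuration.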

Set an Environment

To make the above scenario work, set up the local machine as an SFTP server in a way appropriate to your environment. Also, put "test.csv.gz" under /usr/local.
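
If no "test.csv.gz" is at hand, one can be generated. The sketch below assumes any small gzip-compressed CSV is acceptable as input; the helper name `write_sample_gzip` and the row contents are placeholders, not data from the project.

```python
import csv
import gzip
from pathlib import Path


def write_sample_gzip(directory, name="test.csv.gz"):
    """Write a small gzip-compressed CSV into `directory` and return its path."""
    path = Path(directory) / name
    # "wt" opens the gzip stream in text mode so csv.writer can write strings.
    with gzip.open(path, "wt", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(["id", "name"])  # placeholder header
        writer.writerow([1, "alice"])    # placeholder row
    return path


if __name__ == "__main__":
    # Writing to /usr/local usually requires elevated privileges.
    print(write_sample_gzip("/usr/local"))
```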

Execute a Scenario of ETL Processing

After writing scenario.yml and setting up the environment, execute the scenario with the following commands.

cd sample
poetry run python3 bin/clibomanager.py simple-etl

YAML Configuration

See yaml_configuration.md

Default ETL Modules

See default_etl_modules.md

How to Implement Additional ETL Modules

See additional_etl_modules.md

