Simple, tiny and ridiculous ETL made with Python
Project description
DasLaden is a simple, tiny and ridiculous ETL made with Python.
Dasladen is a general-purpose Python package for automating ETL (extracting, transforming and
loading data) through the configuration of one or more .json files that represent tasks.
It is based on petl. It can perform tasks such as:
- load a .csv file to database table
- run a database query into a .csv file
- run a database query into a database table
- convert a .csv file into another .csv file
- convert a .xls file into a .csv file
- load a .xml file into a database table
These tasks can be configured to apply some of the basic transformations offered by petl, and you can write your own
transformations in a Python module or class to be called by Dasladen during the loading process.
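As a rough illustration of what a custom transformation can look like, the sketch below applies a user-written function to each row of a .csv file before loading the rows into a database table. It uses only the standard library (csv and sqlite3) rather than petl or Dasladen itself. The function name normalize_row, the table name people and the column names are hypothetical, and the way Dasladen actually calls your module is not shown here.

```python
import csv
import io
import sqlite3


def normalize_row(row):
    """Hypothetical custom transformation: tidy the name and cast age to int."""
    return {"name": row["name"].strip().title(), "age": int(row["age"])}


def load_csv_to_table(csv_file, connection, transform):
    """Read rows from csv_file, transform each one, and insert them into 'people'."""
    connection.execute("CREATE TABLE IF NOT EXISTS people (name TEXT, age INTEGER)")
    rows = [transform(row) for row in csv.DictReader(csv_file)]
    connection.executemany(
        "INSERT INTO people (name, age) VALUES (:name, :age)", rows
    )
    connection.commit()
    return len(rows)


# In-memory stand-ins for an input .csv file and a target database.
sample_csv = io.StringIO("name,age\n  alice ,30\nBOB,41\n")
conn = sqlite3.connect(":memory:")
loaded = load_csv_to_table(sample_csv, conn, normalize_row)
result = conn.execute("SELECT name, age FROM people ORDER BY name").fetchall()
```

Keeping the transformation as a plain function of one row makes it easy to test on its own before wiring it into a task.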
There are other types of tasks for things like:
- Compact files into .zip file
- Extract files from .zip file
- Upload a file
- Download a file
- Execute a Python script
- Execute a SQL command
The tasks are configured in a .json file that holds a sequence of tasks, which are executed
in the configured order. Details on how to configure tasks will be available in the Wiki pages.
The basic steps to use DasLaden are:
- Install the dasladen package via `pip install dasladen` in your environment or in a virtualenv.
- Install a database driver package if you want to execute database tasks. Dasladen is prepared to run with the following drivers: MySQL via `PyMySQL`, MS SQL Server via `pyodbc` and Oracle via `cx_Oracle`. Please check the limitations of the driver package that you choose.
- Create a folder for your project.
- Prepare a folder structure in the project folder with the following names:
  - `input`: the default folder for input files, such as .csv, .xml, .xls and .sql files
  - `output`: the default folder where tasks write target files
  - `module`: the folder for Python scripts if you can't put them in the project folder
  - `capture`: the default folder in which to drop task files (.json or .zip)
  - `log`: the folder where Dasladen writes task logs
  - `tasks`: a folder where you can keep task files. It is only a suggestion.
- Create a `.json` file with your tasks in the `tasks` folder.
- Start DasLaden from the project folder by calling `python -m dasladen`.
- If you want to see the log in the console window, pass `--verbose` as an argument on the call.
- Copy the `.json` tasks file from `tasks` to the `capture` folder.
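The folder layout from the steps above can be created by hand, or with a small script such as the following sketch. The function name create_project_layout and the project folder name in the usage line are illustrative only, and the script assumes Python 3 because it relies on pathlib.

```python
from pathlib import Path

# Folder names taken from the steps above.
FOLDERS = ("input", "output", "module", "capture", "log", "tasks")


def create_project_layout(project_dir):
    """Create the project folder and its subfolders; return the folder paths."""
    root = Path(project_dir)
    created = []
    for name in FOLDERS:
        folder = root / name
        # exist_ok=True makes the script safe to run more than once.
        folder.mkdir(parents=True, exist_ok=True)
        created.append(folder)
    return created


if __name__ == "__main__":
    for path in create_project_layout("my_etl_project"):
        print(path)
```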
The watcher will open the tasks file and process it. To see the result, open the log folder and look
for watcher_DD_TT.log, where DD_TT is the date and time at which the log was generated. The log folder
also contains the logs of individual tasks.
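To locate the newest watcher log without reading the date and time in each file name, a helper along these lines may be enough. It is only a sketch: it matches files by the watcher_ prefix and picks the one with the latest modification time, and the function name latest_watcher_log is hypothetical.

```python
from pathlib import Path


def latest_watcher_log(log_dir="log"):
    """Return the most recently modified watcher_*.log file, or None if there is none."""
    candidates = sorted(
        Path(log_dir).glob("watcher_*.log"),
        key=lambda path: path.stat().st_mtime,
    )
    return candidates[-1] if candidates else None


if __name__ == "__main__":
    newest = latest_watcher_log()
    if newest is None:
        print("No watcher log found")
    else:
        print(newest.read_text())
```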
It is important that you copy the task file instead of moving it, because it is deleted once processing finishes.
If you drop a file other than a .zip into the capture folder, that file will be moved to the input folder.
You can also zip the .json file together with all the files it depends on (.csv, .xls, etc.) and copy
that zip into the capture folder. The watcher will unzip it into a temporary folder, copy the input
files (everything other than .json files) to the input folder and execute the .json file.
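Bundling a task file with its dependencies and dropping the archive into the capture folder could be scripted roughly as follows. This is a sketch under assumptions: the function name bundle_and_submit and the default archive name bundle.zip are made up, and placing every file at the root of the archive is a guess about what the watcher expects.

```python
import shutil
import zipfile
from pathlib import Path


def bundle_and_submit(task_file, dependencies, capture_dir="capture",
                      bundle_name="bundle.zip"):
    """Zip a task file with its dependencies and copy the archive into capture_dir."""
    task_file = Path(task_file)
    staging = task_file.parent / bundle_name
    with zipfile.ZipFile(staging, "w", zipfile.ZIP_DEFLATED) as archive:
        # Store files by bare name so they sit at the root of the archive.
        archive.write(task_file, arcname=task_file.name)
        for dependency in dependencies:
            archive.write(dependency, arcname=Path(dependency).name)
    # Copy rather than move: the copy in capture_dir is consumed by the watcher,
    # while the staged archive stays next to the task file.
    return Path(shutil.copy(staging, Path(capture_dir) / bundle_name))
```

A call such as `bundle_and_submit("tasks/job.json", ["input/data.csv"])`, with hypothetical file names, would leave `tasks/bundle.zip` in place and put a copy in `capture`.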
In the .json file you can configure a scheduler to run the tasks. With it you can delay an execution or
configure its recurrence.
Database drivers available as PyPI packages:
- MySQL via PyMySQL package. v >= 0.7.5
- MS SQL Server via pyodbc package. v >= 3.0.10
- Oracle via cx_Oracle package. v >= 5.2.1
- PostgreSQL via psycopg2 package. v >= 2.8.3
The current version works with Python 2 and 3.
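Before running database tasks, it can help to check that the installed drivers meet the minimum versions listed above. The following sketch does that with importlib.metadata, so it assumes Python 3.8 or later. The helper names version_tuple and check_drivers are hypothetical, and the simple numeric comparison ignores pre-release suffixes.

```python
from importlib import metadata  # available from Python 3.8
from itertools import takewhile

# Minimum versions taken from the driver list above.
MINIMUM_VERSIONS = {
    "PyMySQL": "0.7.5",
    "pyodbc": "3.0.10",
    "cx_Oracle": "5.2.1",
    "psycopg2": "2.8.3",
}


def version_tuple(version):
    """Convert '3.0.10' to (3, 0, 10); stop at the first non-numeric piece."""
    parts = []
    for piece in version.split("."):
        digits = "".join(takewhile(str.isdigit, piece))
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def check_drivers(minimums=MINIMUM_VERSIONS):
    """Map each driver to True (new enough), False (too old) or None (not installed)."""
    status = {}
    for name, minimum in minimums.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            # Also reached when a driver was installed under another distribution name.
            status[name] = None
            continue
        status[name] = version_tuple(installed) >= version_tuple(minimum)
    return status


if __name__ == "__main__":
    for driver, ok in check_drivers().items():
        print(driver, {True: "ok", False: "too old", None: "not installed"}[ok])
```

Comparing tuples of integers rather than raw strings avoids ranking 3.0.9 above 3.0.10.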
Download files
Source Distribution
Built Distributions
File details
Details for the file dasladen-0.2.1.tar.gz.
File metadata
- Download URL: dasladen-0.2.1.tar.gz
- Upload date:
- Size: 15.5 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/3.4.1 importlib_metadata/4.6.0 pkginfo/1.7.0 requests/2.22.0 requests-toolbelt/0.9.1 tqdm/4.61.1 CPython/3.8.1
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 6d28143f9a3c18ecd9f1c8bd8fe97299785faad07cae9766e0b2e5114b1b0cc8 |
| MD5 | c62e8197e88b4fceef5054f9bb596e2c |
| BLAKE2b-256 | 1d02756df7a3d068b6cd7d33aeae1321d24881d9ddb30e67143169f4cdfa04a9 |
File details
Details for the file dasladen-0.2.1-py3-none-any.whl.
File metadata
- Download URL: dasladen-0.2.1-py3-none-any.whl
- Upload date:
- Size: 17.4 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/3.4.1 importlib_metadata/4.6.0 pkginfo/1.7.0 requests/2.22.0 requests-toolbelt/0.9.1 tqdm/4.61.1 CPython/3.8.1
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 725d2b5002a38fe134707de114054ee7a2f92782d382e107a4f4b23c51239444 |
| MD5 | 877762118396c5cad3050c7a2c612a5a |
| BLAKE2b-256 | a6b2d2ed537c00760ad3b55d820676b22c93c9647535afa9bd7d3ce4f401b8b0 |
File details
Details for the file dasladen-0.2.1-py2-none-any.whl.
File metadata
- Download URL: dasladen-0.2.1-py2-none-any.whl
- Upload date:
- Size: 17.4 kB
- Tags: Python 2
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/3.4.1 importlib_metadata/4.6.0 pkginfo/1.7.0 requests/2.22.0 requests-toolbelt/0.9.1 tqdm/4.61.1 CPython/3.8.1
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 91cc0382b1d202037808af3ee9ddaf6372c28bdccfdd9297e67909458d913303 |
| MD5 | 7e04f45ee6f6376f7bfd14d4b064c982 |
| BLAKE2b-256 | ceed829e5c767c6ec60702c8170f5d009563a0ff2d2ea3f388bfa59ca410072b |