Framework to build data pipelines
Project description
luisy
This tool is an extension for the Python framework luigi that helps build reproducible and complex data pipelines for batch jobs. Visit our docs to learn more!
How to use?
This is what an end-to-end luisy pipeline may look like:
import luisy
import pandas as pd


@luisy.raw
@luisy.csv_output(delimiter=',')
class InputFile(luisy.ExternalTask):
    label = luisy.Parameter()

    def get_file_name(self):
        return f"file_{self.label}"


@luisy.interim
@luisy.requires(InputFile)
class ProcessedFile(luisy.Task):
    def run(self):
        df = self.input().read()
        # Some more preprocessing
        # ...
        # Write to disk
        self.write(df)


@luisy.final
class MergedFile(luisy.ConcatenationTask):
    def requires(self):
        for label in ['a', 'b', 'c', 'd']:
            yield ProcessedFile(label=label)
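To make the data flow of the three tasks concrete, here is a minimal sketch in plain Python that mimics it without luisy or luigi: one CSV input per label is read, passed through a processing step, and the results are concatenated. The in-memory `RAW_FILES` dictionary, the functions `read_raw`, `process`, and `merge`, and the `label` column added during processing are illustrative assumptions, not part of luisy's API.

```python
import csv
import io

# Hypothetical stand-ins for the raw input files "file_a" and "file_b".
RAW_FILES = {
    "file_a": "id,value\n1,10\n2,20\n",
    "file_b": "id,value\n3,30\n",
}


def get_file_name(label):
    # Mirrors InputFile.get_file_name from the pipeline example.
    return f"file_{label}"


def read_raw(label):
    # Stand-in for the raw CSV input: parse the rows into dictionaries.
    text = RAW_FILES[get_file_name(label)]
    return list(csv.DictReader(io.StringIO(text), delimiter=","))


def process(label):
    # Stand-in for ProcessedFile.run: read the input, transform it, return it.
    rows = read_raw(label)
    return [{**row, "label": label} for row in rows]


def merge(labels):
    # Stand-in for the concatenation step: stack all processed outputs.
    merged = []
    for label in labels:
        merged.extend(process(label))
    return merged


merged = merge(["a", "b"])
print(len(merged))
```

In the luisy version, the decorators and base classes take over the file handling and dependency resolution that this sketch does by hand.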
How to install?
Stable Branch: main
Minimum python version: 3.8
Install luisy with
pip install luisy
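To confirm that the interpreter meets the minimum version stated above and that the package is visible after installation, a small standard-library check can be used. This is only a sketch; the helper names `meets_minimum_python` and `is_installed` are hypothetical and not part of luisy.

```python
import importlib.util
import sys

MINIMUM_PYTHON = (3, 8)  # minimum version stated in this section


def meets_minimum_python(version_info=None):
    # Compare (major, minor) of the interpreter against the minimum.
    if version_info is None:
        version_info = sys.version_info
    return tuple(version_info[:2]) >= MINIMUM_PYTHON


def is_installed(package_name):
    # True if the top-level package can be located without importing it.
    return importlib.util.find_spec(package_name) is not None


print("Python version OK:", meets_minimum_python())
print("luisy installed:", is_installed("luisy"))
```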
How to test?
To run all unit tests in the tests directory, use the following command:
pytest
How to contribute?
Please have a look at our contribution guide.
Third-Party Licenses
Runtime dependencies
Name | License | Type |
---|---|---|
numpy | BSD-3-Clause License | Dependency |
pandas | BSD 3-Clause License | Dependency |
networkx | BSD-3-Clause License | Dependency |
luigi | Apache License 2.0 | Dependency |
distlib | Python license | Dependency |
matplotlib | Other | Dependency |
azure-storage-blob | MIT License | Dependency |
tables | BSD license | Dependency |
pipdeptree | MIT License | Dependency |
requirements-parser | Apache License 2.0 | Dependency |
pyarrow | Apache License 2.0 | Dependency |
spark | Apache License 2.0 | Dependency |
Development dependencies
Name | License | Type |
---|---|---|
sphinx | BSD-2-Clause | Dependency |
sphinx_rtd_theme | MIT License | Dependency |
flake8 | MIT License | Dependency |
pytest | MIT License | Dependency |
pytest-flake8 | BSD License | Dependency |
pytest-cov | MIT License | Dependency |
pip-tools | BSD 3-Clause License | Dependency |
Download files
Source Distribution
luisy-1.4.3.tar.gz (40.7 kB)
Built Distribution
luisy-1.4.3-py2.py3-none-any.whl (46.8 kB)
File details
Details for the file luisy-1.4.3.tar.gz.
File metadata
- Download URL: luisy-1.4.3.tar.gz
- Size: 40.7 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/5.1.1 CPython/3.12.7
File hashes
Algorithm | Hash digest |
---|---|
SHA256 | 8a1ed7071b9284f5f4a5b0c0754f0d7157b5ddb1ab87c661cf6fd9edef2ddbdc |
MD5 | 4869eac9340f05f960eb18dd0ae0de36 |
BLAKE2b-256 | 2eb674680540976bd19c7a39db8a3983ec79b0304f9893eb829aa64893f0b507 |
File details
Details for the file luisy-1.4.3-py2.py3-none-any.whl.
File metadata
- Download URL: luisy-1.4.3-py2.py3-none-any.whl
- Size: 46.8 kB
- Tags: Python 2, Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/5.1.1 CPython/3.12.7
File hashes
Algorithm | Hash digest |
---|---|
SHA256 | a0879ff3b12b89bc6fec43436e148241364b5113b489abe1bf6c06339fd5a080 |
MD5 | 4a3c10933f2b9e1a76cba4c959e7e798 |
BLAKE2b-256 | 4b5964712f07d4c0b53241e82ebe49a4ab20d8cfcc4c1f78896ed3201a561172 |
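The SHA256 digests listed above can be used to check a downloaded file for corruption or tampering. The following is a minimal sketch using only Python's standard `hashlib` module; the helper names `sha256_of_file` and `matches_digest` are hypothetical, and the file is assumed to have been downloaded to a local path already.

```python
import hashlib


def sha256_of_file(path, chunk_size=65536):
    # Hash the file in chunks so large downloads need not fit in memory.
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_digest(path, expected_hex):
    # Case-insensitive comparison against a published hex digest.
    return sha256_of_file(path) == expected_hex.strip().lower()
```

For example, `matches_digest("luisy-1.4.3.tar.gz", "8a1ed7071b9284f5f4a5b0c0754f0d7157b5ddb1ab87c661cf6fd9edef2ddbdc")` should return `True` for an intact copy of the source distribution, assuming the file sits in the current directory.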