
Framework to build data pipelines


luisy


This tool is an extension for the Python framework luigi that helps build reproducible, complex data pipelines for batch jobs. Visit our docs to learn more!


How to use?

An end-to-end luisy pipeline may look like this:

    import luisy
    import pandas as pd
    
    @luisy.raw
    @luisy.csv_output(delimiter=',')
    class InputFile(luisy.ExternalTask):
        label = luisy.Parameter()
    
        def get_file_name(self): 
            return f"file_{self.label}"
    
    @luisy.interim
    @luisy.requires(InputFile)
    class ProcessedFile(luisy.Task):
        def run(self):
            df = self.input().read()
            # Some more preprocessings
            # ...
            # Write to disk
            self.write(df)
    
    @luisy.final
    class MergedFile(luisy.ConcatenationTask):
        def requires(self):
            for label in ['a', 'b', 'c', 'd']:
                yield ProcessedFile(label=label)
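
For readers new to the luigi execution model that luisy extends, the following is a minimal sketch of the underlying idea: a task counts as complete once its output exists, and requirements are built before the task that needs them. It uses only the standard library. `SketchTask`, `build`, and `demo` are hypothetical names made up for illustration; this is not how luisy or luigi is implemented.

```python
import tempfile
from pathlib import Path


class SketchTask:
    """Hypothetical stand-in for a pipeline task (not a luisy or luigi class)."""

    def __init__(self, name, workdir, deps=()):
        self.name = name
        self.workdir = Path(workdir)
        self.deps = list(deps)

    def output(self):
        # Each task owns exactly one output file.
        return self.workdir / f"{self.name}.txt"

    def complete(self):
        # A task is considered done as soon as its output exists.
        return self.output().exists()

    def run(self):
        # Leaf tasks write their own name; other tasks join their inputs.
        parts = [dep.output().read_text() for dep in self.deps]
        self.output().write_text("+".join(parts) if parts else self.name)


def build(task, executed):
    """Run `task` after its requirements, skipping anything already complete."""
    if task.complete():
        return
    for dep in task.deps:
        build(dep, executed)
    task.run()
    executed.append(task.name)


def demo():
    with tempfile.TemporaryDirectory() as workdir:
        processed = [SketchTask(f"processed_{label}", workdir) for label in "abcd"]
        merged = SketchTask("merged", workdir, deps=processed)

        first_run = []
        build(merged, first_run)   # runs the four inputs, then the merge

        second_run = []
        build(merged, second_run)  # nothing to do: the output already exists

        content = merged.output().read_text()
    return first_run, second_run, content


if __name__ == "__main__":
    print(demo())
```

Running the sketch twice against the same working directory shows why such pipelines are cheap to re-run: the second call to `build` executes nothing because the merged output is already on disk.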

How to install?

Stable branch: main

Minimum Python version: 3.8

Install luisy with:

    pip install luisy
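
As a quick sanity check after installing, the sketch below tests whether the running interpreter meets the minimum version stated above and whether a package named `luisy` can be found. The helper names `meets_minimum` and `luisy_installed` are hypothetical; only standard-library calls are used.

```python
import importlib.util
import sys

MINIMUM_PYTHON = (3, 8)  # minimum version stated in this README


def meets_minimum(version_info=None):
    """Return True if the given (or current) Python version is new enough."""
    if version_info is None:
        version_info = sys.version_info
    return tuple(version_info[:2]) >= MINIMUM_PYTHON


def luisy_installed():
    """Return True if a top-level package named luisy is importable."""
    return importlib.util.find_spec("luisy") is not None


if __name__ == "__main__":
    print("Python version OK:", meets_minimum())
    print("luisy found:", luisy_installed())
```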

How to test?

To run all unit tests in the tests directory, use the following command:

    pytest

How to contribute?

Please have a look at our contribution guide.

Third-Party Licenses

Runtime dependencies

| Name | License | Type |
|------|---------|------|
| numpy | BSD-3-Clause License | Dependency |
| pandas | BSD 3-Clause License | Dependency |
| networkx | BSD-3-Clause License | Dependency |
| luigi | Apache License 2.0 | Dependency |
| distlib | Python license | Dependency |
| matplotlib | Other | Dependency |
| azure-storage-blob | MIT License | Dependency |
| tables | BSD license | Dependency |
| pipdeptree | MIT License | Dependency |
| requirements-parser | Apache License 2.0 | Dependency |
| pyarrow | Apache License 2.0 | Dependency |

Development dependencies

| Name | License | Type |
|------|---------|------|
| sphinx | BSD-2-Clause | Dependency |
| sphinx_rtd_theme | MIT License | Dependency |
| flake8 | MIT License | Dependency |
| pytest | MIT License | Dependency |
| pytest-flake8 | BSD License | Dependency |
| pytest-cov | MIT License | Dependency |
| pip-tools | BSD 3-Clause License | Dependency |
