Framework to build data pipelines

Project description

luisy

This tool is an extension for the Python framework luigi that helps to build reproducible and complex data pipelines for batch jobs. Visit our docs to learn more!


How to use?

This is what an end-to-end luisy pipeline may look like:

    import luisy
    import pandas as pd
    
    @luisy.raw
    @luisy.csv_output(delimiter=',')
    class InputFile(luisy.ExternalTask):
        label = luisy.Parameter()
    
        def get_file_name(self): 
            return f"file_{self.label}"
    
    @luisy.interim
    @luisy.requires(InputFile)
    class ProcessedFile(luisy.Task):
        def run(self):
            df = self.input().read()
            # Some more preprocessing
            # ...
            # Write to disk
            self.write(df)
    
    @luisy.final
    class MergedFile(luisy.ConcatenationTask):
        def requires(self):
            for label in ['a', 'b', 'c', 'd']:
                yield ProcessedFile(label=label)
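
To make the data flow of the pipeline above easier to follow, here is a minimal plain-Python sketch of the same three stages. It does not use luisy or luigi at all and is not how luisy implements them: the in-memory `RAW_FILES` dictionary, the column names, and the helper functions `read_input`, `process`, and `merge` are hypothetical, and it assumes that the concatenation task simply appends the processed outputs of all required tasks.

```python
import csv
import io

# Hypothetical in-memory stand-ins for the raw CSV files "file_a" ... "file_d".
RAW_FILES = {
    f"file_{label}": f"id,value\n1,{label}1\n2,{label}2\n"
    for label in ["a", "b", "c", "d"]
}


def read_input(label):
    """Mimic InputFile: parse the comma-delimited file for one label."""
    text = RAW_FILES[f"file_{label}"]
    return list(csv.DictReader(io.StringIO(text), delimiter=","))


def process(rows, label):
    """Mimic ProcessedFile: a placeholder preprocessing step that tags each row."""
    return [{**row, "label": label} for row in rows]


def merge(labels):
    """Mimic MergedFile: concatenate the processed rows of all labels."""
    merged = []
    for label in labels:
        merged.extend(process(read_input(label), label))
    return merged


merged_rows = merge(["a", "b", "c", "d"])
print(len(merged_rows))  # 4 labels x 2 rows each = 8
```

In the luisy version, each stage additionally persists its output to disk, so a stage whose output already exists does not have to be recomputed.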

How to install?

Stable Branch: main

Minimum Python version: 3.8

Install luisy with:

    pip install luisy
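
After installing, a quick check like the following sketch can confirm that the interpreter meets the minimum Python version stated above and that the package is visible to it. It relies only on the standard library (`sys` and `importlib.metadata`); the helper names `meets_minimum_python` and `installed_version` are hypothetical and not part of luisy.

```python
import sys
from importlib import metadata


def meets_minimum_python(version_info=sys.version_info, minimum=(3, 8)):
    """Return True if the (major, minor) version is at least `minimum`."""
    return tuple(version_info[:2]) >= minimum


def installed_version(package):
    """Return the installed version string of `package`, or None if it is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


print("Python OK:", meets_minimum_python())
print("luisy version:", installed_version("luisy"))
```

If the second line prints `None`, the package is not installed in the environment that this interpreter uses.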

How to test?

To run all unit tests in the tests directory, use the following command:

    pytest

How to contribute?

Please have a look at our contribution guide.

Third-Party Licenses

Runtime dependencies

| Name | License | Type |
|---|---|---|
| numpy | BSD-3-Clause License | Dependency |
| pandas | BSD 3-Clause License | Dependency |
| networkx | BSD-3-Clause License | Dependency |
| luigi | Apache License 2.0 | Dependency |
| distlib | Python license | Dependency |
| matplotlib | Other | Dependency |
| azure-storage-blob | MIT License | Dependency |
| tables | BSD license | Dependency |
| pipdeptree | MIT License | Dependency |
| requirements-parser | Apache License 2.0 | Dependency |
| pyarrow | Apache License 2.0 | Dependency |
| spark | Apache License 2.0 | Dependency |

Development dependencies

| Name | License | Type |
|---|---|---|
| sphinx | BSD-2-Clause | Dependency |
| sphinx_rtd_theme | MIT License | Dependency |
| flake8 | MIT License | Dependency |
| pytest | MIT License | Dependency |
| pytest-flake8 | BSD License | Dependency |
| pytest-cov | MIT License | Dependency |
| pip-tools | BSD 3-Clause License | Dependency |

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

luisy-1.4.3.tar.gz (40.7 kB)

Uploaded Source

Built Distribution

luisy-1.4.3-py2.py3-none-any.whl (46.8 kB)

Uploaded Python 2 Python 3

File details

Details for the file luisy-1.4.3.tar.gz.

File metadata

  • Download URL: luisy-1.4.3.tar.gz
  • Upload date:
  • Size: 40.7 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.1.1 CPython/3.12.7

File hashes

Hashes for luisy-1.4.3.tar.gz
Algorithm Hash digest
SHA256 8a1ed7071b9284f5f4a5b0c0754f0d7157b5ddb1ab87c661cf6fd9edef2ddbdc
MD5 4869eac9340f05f960eb18dd0ae0de36
BLAKE2b-256 2eb674680540976bd19c7a39db8a3983ec79b0304f9893eb829aa64893f0b507
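
A downloaded file can be checked against the SHA256 digest listed above with a few lines of standard-library Python. This is a minimal sketch: the helper names `sha256_of_file` and `matches_expected` are hypothetical, and the path passed in the usage comment assumes the archive was saved to the current directory under its original name.

```python
import hashlib


def sha256_of_file(path, chunk_size=65536):
    """Compute the SHA256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected_hex):
    """Compare the file's digest with an expected hex digest, ignoring case and whitespace."""
    return sha256_of_file(path) == expected_hex.strip().lower()


# Usage (assumes the file exists locally):
# matches_expected(
#     "luisy-1.4.3.tar.gz",
#     "8a1ed7071b9284f5f4a5b0c0754f0d7157b5ddb1ab87c661cf6fd9edef2ddbdc",
# )
```

Reading in chunks keeps memory use constant regardless of file size, which matters little for a 40 kB archive but makes the helper reusable for larger downloads.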

See more details on using hashes here.

File details

Details for the file luisy-1.4.3-py2.py3-none-any.whl.

File metadata

  • Download URL: luisy-1.4.3-py2.py3-none-any.whl
  • Upload date:
  • Size: 46.8 kB
  • Tags: Python 2, Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.1.1 CPython/3.12.7

File hashes

Hashes for luisy-1.4.3-py2.py3-none-any.whl
Algorithm Hash digest
SHA256 a0879ff3b12b89bc6fec43436e148241364b5113b489abe1bf6c06339fd5a080
MD5 4a3c10933f2b9e1a76cba4c959e7e798
BLAKE2b-256 4b5964712f07d4c0b53241e82ebe49a4ab20d8cfcc4c1f78896ed3201a561172

