
Supplements to the Python SPark ETL libRary (SPETLR) for Databricks.


spetlr-tools


Description

SPETLR-tools is a library that provides a set of tools for working with Databricks Lakehouses. These tools include test fixtures and development utilities that are not part of the runtime tools in SPETLR.

Visit the official SPETLR webpage: https://spetlr.com/

Purpose of SPETLR-tools

SPETLR-tools is designed to support SPETLR in various scenarios, including:

  • Test tools in pytest:
    • Examples: Dataframe validation checks, Data format checking, ...
  • Helpers for investigating data:
    • Examples: Extract schema from binary encoded columns, Get the difference between two dataframes, ...
  • SPETLR-tools CLI:
    • Examples: Submit pytests to Databricks cluster, Automated Azure Token extraction, ...

SPETLR-tools vs. SPETLR

  • SPETLR-tools: Tested in a Python interpreter and, as of January 2024, also integration tested using on-cluster job tests.
  • SPETLR-tools: The GitHub workflow has a very simple Azure deployment.
  • SPETLR: Fully unit and integration tested - library ready for production use
  • SPETLR-tools: Supports deployment and testing
    • Use only in test_requirements.txt
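
One way to enforce the "test requirements only" rule is a small check over the requirements files. The sketch below is an illustration, not part of SPETLR-tools: the helper name `listed_packages` and the sample file contents are assumptions, and the parsing only handles simple `name` or `name==version` lines.

```python
import re


def listed_packages(requirements_text: str) -> set[str]:
    """Return normalized distribution names found in requirements-file text."""
    names = set()
    for line in requirements_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop trailing comments
        if not line or line.startswith("-"):  # skip blanks and pip options such as -r
            continue
        match = re.match(r"[A-Za-z0-9][A-Za-z0-9._-]*", line)
        if match:
            # Normalize the name so that "Spetlr_Tools" and "spetlr-tools" compare equal
            names.add(re.sub(r"[-_.]+", "-", match.group(0)).lower())
    return names


# Hypothetical file contents, for illustration only
runtime = "spetlr\npyyaml>=6\n"
test_only = "pytest\nspetlr-tools  # test fixtures only\n"

assert "spetlr-tools" not in listed_packages(runtime)
assert "spetlr-tools" in listed_packages(test_only)
```

In a real project the two strings would be read from requirements.txt and test_requirements.txt.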

Installation

Install SPETLR-tools from PyPI:

pip install spetlr-tools
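
To confirm that the installation succeeded, the installed version can be read with the standard library. This is a minimal sketch; the helper name `installed_version` is an assumption, and only `importlib.metadata` from the Python standard library is used.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(dist_name: str):
    """Return the installed version of a distribution, or None if it is missing."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


# Prints the version string if spetlr-tools is installed, otherwise None
print(installed_version("spetlr-tools"))
```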

Development Notes

To prepare for development, please install the following additional requirements:

  • Java 8
  • pip install -r test_requirements.txt

Then install the package locally:

python setup.py develop

Testing

Local tests

After installing the dev-requirements, execute tests by running:

pytest tests

These tests are located in the ./tests/unit folder and only require a Python interpreter. Pull requests will not be accepted if these tests do not pass. If you add new features, please include corresponding tests.
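
As a rough illustration of what a corresponding unit test could look like, the sketch below pairs a small helper with two pytest-style test functions. The helper `to_snake_case` and the file name are hypothetical and do not exist in SPETLR-tools; the point is only the shape of an interpreter-only test that pytest can discover.

```python
# tests/unit/test_column_names.py  (hypothetical file name)


def to_snake_case(name: str) -> str:
    """Hypothetical helper standing in for a newly added feature."""
    # Lower-case the name and join whitespace-separated words with underscores
    return "_".join(name.strip().lower().split())


def test_to_snake_case_replaces_spaces():
    assert to_snake_case("Order Id") == "order_id"


def test_to_snake_case_collapses_whitespace():
    assert to_snake_case("  Order   Id ") == "order_id"
```

pytest collects functions whose names start with `test_`, so no Spark session or cluster is needed for tests of this kind.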

CLI and Cluster tests

During the pre-integration workflow (.github/workflows/pre-integration.yml), the CLI commands supported by spetlr-tools are (or should be) tested.

General Project Info


Contributing

Feel free to contribute to SPETLR-tools. Any contribution is appreciated, whether it is a new feature or an improvement to existing functionality.

If you have a suggestion that can enhance SPETLR-tools, please fork the repository and create a pull request. Alternatively, you can open an issue with the "enhancement" tag.

  1. Fork the Project
  2. Create your Feature Branch (git checkout -b feature/NewSPETLRToolsFeature)
  3. Commit your Changes (git commit -m 'Add some SPETLRToolsFeature')
  4. Push to the Branch (git push origin feature/NewSPETLRToolsFeature)
  5. Open a Pull Request


Releases

Releases to PyPI are published by a GitHub Action that must be triggered manually.

Contact

For any inquiries, please use the SPETLR Discord Server.
