Supplements to the python SPark ETL libRary (SPETLR) for Databricks.
spetlr-tools
Table of Contents
- Description
- Purpose of spetlr-tools
- Installation
- Development Notes
- Testing
- General Project Info
- Contributing
- Build Status
- Releases
- Contact
Description
SPETLR-tools is a library that provides a set of tools for working with Databricks Lakehouses. These tools include test fixtures and development utilities that are not part of the runtime tools in SPETLR.
Visit the official SPETLR webpage: https://spetlr.com/
Purpose of SPETLR-tools
SPETLR-tools is designed to support SPETLR in various scenarios, including:
- Test tools in pytest:
  - Examples: Dataframe validation checks, Data format checking, ...
- Helpers for investigating data:
  - Examples: Extract schema from binary encoded columns, Get the difference between two dataframes, ...
- SPETLR-tools CLI:
  - Examples: Submit pytests to Databricks cluster, Automated Azure Token extraction, ...
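To make the "difference between two dataframes" idea concrete, here is a minimal sketch in plain Python. It is not the SPETLR-tools API: the function name `row_difference` is hypothetical, and rows are modelled as plain tuples rather than Spark rows so that the example runs without a cluster.

```python
from collections import Counter
from typing import Iterable, List, Tuple

Row = Tuple  # hypothetical stand-in for a Spark row


def row_difference(left: Iterable[Row], right: Iterable[Row]) -> List[Row]:
    """Return rows present in `left` but not in `right`, respecting duplicates."""
    remaining = Counter(left)
    remaining.subtract(Counter(right))
    # Counter.elements() skips entries whose count is zero or negative.
    return sorted(remaining.elements())


expected = [(1, "a"), (2, "b"), (2, "b")]
actual = [(1, "a"), (2, "b"), (3, "c")]

missing = row_difference(expected, actual)     # rows we expected but did not get
unexpected = row_difference(actual, expected)  # rows we got but did not expect
```

Computing the difference in both directions separates rows that are missing from rows that are unexpected, which is usually what a failing data test needs to report.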
SPETLR-tools vs. SPETLR
- SPETLR-tools: Only tested in a Python interpreter
  - The GitHub workflow does not have an Azure deployment
  - Consequently, no integration tests run on clusters
- SPETLR: Fully unit and integration tested, library ready for production use
- SPETLR-tools: Supports deployment and testing
  - Use only in test_requirements.txt
Installation
Install SPETLR-tools from PyPI:
pip install spetlr-tools
Development Notes
To prepare for development, install the following additional requirements:
- Java 8
pip install -r test_requirements.txt
Then install the package locally:
python setup.py develop
Testing
Local tests
After installing the dev-requirements, execute tests by running:
pytest tests
These tests are located in the ./tests/unit folder and only require a Python interpreter. Pull requests will not be accepted if these tests do not pass. If you add new features, please include corresponding tests.
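As an illustration of a test that needs only a Python interpreter, here is a minimal sketch of a pytest module. The file name, the helper `missing_columns`, and the column names are all hypothetical and not part of SPETLR-tools. In a real contribution the helper would live in the package and be imported by the test.

```python
# tests/unit/test_column_check.py (hypothetical file name)
from typing import Iterable, List


def missing_columns(actual: Iterable[str], required: Iterable[str]) -> List[str]:
    """Hypothetical feature under test: list required columns that are absent."""
    present = set(actual)
    return [name for name in required if name not in present]


def test_all_required_columns_present():
    assert missing_columns(["id", "name", "ts"], ["id", "ts"]) == []


def test_reports_missing_columns_in_order():
    assert missing_columns(["id"], ["ts", "id", "name"]) == ["ts", "name"]
```

pytest collects functions whose names start with `test_`, so a module like this would be picked up by the `pytest tests` command.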
General Project Info
Contributing
Contributions to SPETLR-tools are welcome. Any contribution is appreciated, whether it adds a new feature or improves existing functionality.
If you have a suggestion that can enhance SPETLR-tools, please fork the repository and create a pull request. Alternatively, you can open an issue with the "enhancement" tag.
- Fork the Project
- Create your Feature Branch (git checkout -b feature/NewSPETLRToolsFeature)
- Commit your Changes (git commit -m 'Add some SPETLRToolsFeature')
- Push to the Branch (git push origin feature/NewSPETLRToolsFeature)
- Open a Pull Request
Build Status
Releases
Releases to PyPI are published by a GitHub Action that must be triggered manually.
Contact
For any inquiries, please use the SPETLR Discord Server.