konekuta is a highly-opinionated Python library for creating ELT data pipelines with Google Cloud.
It is designed to reduce boilerplate code and improve maintainability across multiple data pipelines that share the same data architecture. Although effort has gone into reusability, the library currently supports only a highly opinionated data architecture and is compatible only with Google Cloud Platform.
konekuta supports the following Python versions:
- Python 3.7
- Python 3.8
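As a small illustration of the version list above, the following sketch checks whether an interpreter is one of the listed versions before a pipeline is deployed. The names `SUPPORTED_VERSIONS` and `is_supported_python` are hypothetical helpers written for this example; they are not part of konekuta.

```python
import sys

# Minor versions listed as supported in this README (an assumption copied
# from the list above, not read from the package metadata).
SUPPORTED_VERSIONS = {(3, 7), (3, 8)}


def is_supported_python(version_info=None):
    """Return True if the (major, minor) pair is in SUPPORTED_VERSIONS.

    Hypothetical helper for illustration only; konekuta does not ship it.
    """
    if version_info is None:
        version_info = sys.version_info
    return tuple(version_info[:2]) in SUPPORTED_VERSIONS


if __name__ == "__main__":
    if not is_supported_python():
        print("Warning: this interpreter is not one of the listed versions.")
```

A check like this could run at the top of a deployment script so that an unsupported interpreter is flagged early rather than failing mid-pipeline.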
Documentation is available at https://konekuta.readthedocs.io.
Copyright © 2020 iProspect Singapore. All rights reserved.
All notable changes to this project will be documented in this file.
0.6.0 - 2020-09-02
- Added persist_state decorator for persisting state across pipeline stages (#163)
- Fixed outdated example that did not use the updated schema format (#164)
0.5.0 - 2020-08-07
- Added parse_timestamp utility function for parsing raw timestamps (#136)
- Added support for advanced backfilling via the scheduled_timestamp parameter (#136)
- Added support for advanced date ranges via the offset parameter (#155)
- Fixed incompatible sed expression on macOS environments (#138) (#156)
- Dropped Python 3.6 support (#137)
0.4.0 - 2020-06-08
- Added format_dates flag to standardise the date format in data files (#92)
- File sizes are now logged (#93)
- Added get_unix_timestamp utility function for converting dates to Unix time (#94)
- Added remove_leading_rows and remove_trailing_rows transformer functions (#97) (#120)
- Added get_row_matching_prefix function to return the row number with a matching prefix (#98) (#124)
0.3.0 - 2020-05-18
- Calling the Pipeline.run() method automatically cleans up files when done (#83)
- Pipeline.extract() method now supports multiple raw file uploads (#85)
- split_data() function will now log an error if no files were generated (#86)
- Added strict and debug runtime modes (#87)
- Fixed empty variable in log message (#82)
0.2.0 - 2020-04-28
- Added schema helper functions (#51)
- Added function for updating BigQuery table attributes (#52)
- Added function for checking Google Compute instance (#57)
- Added JSON log formatter for GKE (#62)
- Added dynamic schema feature (#66)
- Changed Google API authentication helpers (#49)
- Improved date range derivation logic (#61)
- Changed Cloud Storage directory names and structure (#68)
- Changed supported schema format (#75) (#77)
- Fixed deprecation warnings from BigQuery Python SDK (#52)
- Fixed silenced Read the Docs build errors (#55)
0.1.0 - 2020-03-09