rdsa-utils
A suite of pyspark, pandas and general pipeline utils for Reproducible Data Science and Analysis (RDSA) projects.
The RDSA team sits within the Economic Statistics Change Directorate, and uses cutting-edge data science and engineering skills to produce the next generation of economic statistics. Current priorities include overhauling legacy systems and developing new systems for key statistics.
rdsa-utils is a Python codebase built with Python 3.8 and higher, and uses setup.py for dependency management and packaging.
Prerequisites
- Python 3.8 or higher
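As a minimal sketch of how a project script might check this prerequisite before importing the package, the snippet below compares the running interpreter against the 3.8 minimum. The names `MIN_PYTHON` and `meets_minimum_python` are hypothetical and are not part of rdsa-utils.

```python
import sys

# Minimum version taken from the prerequisite above.
MIN_PYTHON = (3, 8)


def meets_minimum_python(version_info=None, minimum=MIN_PYTHON):
    """Return True if the given (major, minor, ...) version is at least `minimum`.

    Hypothetical helper for illustration; defaults to the running interpreter.
    """
    if version_info is None:
        version_info = sys.version_info
    # Compare only the major and minor components.
    return tuple(version_info[:2]) >= tuple(minimum)


if __name__ == "__main__":
    if not meets_minimum_python():
        raise SystemExit("Python 3.8 or higher is required.")
```

Passing an explicit tuple, for example `meets_minimum_python((3, 7))`, lets the check be exercised without changing interpreters.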
Documentation and Further Information
Our documentation is automatically generated using GitHub Actions and MkDocs. For an in-depth understanding of rdsa-utils, how to contribute to rdsa-utils, and more, please refer to our MkDocs-generated documentation.
Licence
Unless stated otherwise, the codebase is released under the MIT License. This covers both the codebase and any sample code in the documentation.
The documentation is © Crown copyright and available under the terms of the Open Government Licence v3.0.
Hashes for rdsa_utils-0.1.8-py3-none-any.whl
Algorithm | Hash digest
---|---
SHA256 | 69f220a7605fb5ab5850450c7a3a5ab49e475e3ba1e305316521616092f87252
MD5 | 9f2570c88d8223afbf5e3d6e03634b53
BLAKE2b-256 | b9e6fcbae75759b23d13a8c43faafeadb7a91b6244c0ebf0e32aaf7971ff0ea6
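A downloaded wheel can be checked against the SHA256 digest listed above. The following is a minimal sketch using only the standard library. The helper names `sha256_of_file` and `matches_published_hash` are hypothetical, and the wheel path in the usage section assumes the file sits in the current directory.

```python
import hashlib

# SHA256 digest published above for rdsa_utils-0.1.8-py3-none-any.whl.
EXPECTED_SHA256 = "69f220a7605fb5ab5850450c7a3a5ab49e475e3ba1e305316521616092f87252"


def sha256_of_file(path, chunk_size=65536):
    """Return the hex SHA256 digest of the file at `path`, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        # Read fixed-size chunks so large files are not loaded into memory at once.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path, expected=EXPECTED_SHA256):
    """Return True if the file's SHA256 digest equals `expected` (case-insensitive)."""
    return sha256_of_file(path) == expected.lower()


if __name__ == "__main__":
    # Assumed location of the downloaded wheel; adjust as needed.
    wheel_path = "rdsa_utils-0.1.8-py3-none-any.whl"
    print("hash matches:", matches_published_hash(wheel_path))
```

SHA256 is used here because MD5 is not considered collision-resistant; the same pattern works with `hashlib.blake2b(digest_size=32)` for the BLAKE2b-256 digest.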