rdsa-utils

A suite of PySpark, pandas and general pipeline utils for Reproducible Data Science and Analysis (RDSA) projects.

The RDSA team sits within the Economic Statistics Change Directorate, and uses cutting-edge data science and engineering skills to produce the next generation of economic statistics. Current priorities include overhauling legacy systems and developing new systems for key statistics.

rdsa-utils is a Python codebase that supports Python 3.8 and higher, and uses setup.py for dependency management and packaging.

Prerequisites

  • Python 3.8 or higher
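
The prerequisite above can be checked before installing. The following is a minimal sketch only: the helper name meets_minimum_python and the constant MIN_PYTHON are hypothetical and are not part of rdsa-utils; the only fact taken from this page is the Python 3.8 minimum.

```python
import sys

# Minimum version stated in the prerequisites above.
# MIN_PYTHON is an illustrative name, not something rdsa-utils defines.
MIN_PYTHON = (3, 8)


def meets_minimum_python(version_info=None, minimum=MIN_PYTHON):
    """Return True if a (major, minor, ...) version is at least `minimum`.

    Hypothetical helper for illustration; defaults to the running interpreter.
    """
    if version_info is None:
        version_info = sys.version_info
    # Compare only the (major, minor) pair against the minimum.
    return tuple(version_info[:2]) >= tuple(minimum)


if __name__ == "__main__":
    if meets_minimum_python():
        print("Python version is supported.")
    else:
        print("Python 3.8 or higher is required.")
```

Run as a script, this prints whether the current interpreter satisfies the stated minimum; passing an explicit tuple such as (3, 7) lets the check be used without touching the running interpreter.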

Documentation and Further Information

Our documentation is automatically generated using GitHub Actions and MkDocs. For an in-depth understanding of rdsa-utils, how to contribute to rdsa-utils, and more, please refer to our MkDocs-generated documentation.

Licence

Unless stated otherwise, the codebase is released under the MIT License. This covers both the codebase and any sample code in the documentation.

The documentation is © Crown copyright and available under the terms of the Open Government Licence v3.0.

Download files

Source Distribution

rdsa-utils-0.1.8.tar.gz (41.3 kB)

Built Distribution

rdsa_utils-0.1.8-py3-none-any.whl (46.6 kB, Python 3)
