Introduction
This repository was created to provide reusable tools for validating "double-runs" in the context of the Cockpit migration project, but also for other refactoring projects (MLOps, LACI, etc.).
Getting Started
Inside the src/doublerun/ folder, you will find two scripts:
pandas.py: Contains functions for checking that two pandas DataFrames are equal, or for spotting their differences.
spark.py: Contains functions for checking that two Spark DataFrames are equal, or for spotting their differences.
Inside the examples folder, you will find notebooks detailing how these tools can be used.
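To give a rough idea of the kind of check the pandas script is meant for, here is a minimal sketch built only on pandas itself. The helper names frames_match and frame_differences, as well as the sample data, are hypothetical and are not taken from this package; refer to src/doublerun/pandas.py and the notebooks for the actual functions.

```python
import pandas as pd


def frames_match(left: pd.DataFrame, right: pd.DataFrame) -> bool:
    """Hypothetical helper: return True when both frames hold the same data.

    check_like=True ignores the order of rows and columns, as long as
    the index and column labels line up.
    """
    try:
        pd.testing.assert_frame_equal(left, right, check_like=True)
        return True
    except AssertionError:
        return False


def frame_differences(left: pd.DataFrame, right: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper: return the cells that differ.

    DataFrame.compare requires both frames to share the same index and
    column labels; it keeps only the rows and columns with a mismatch.
    """
    return left.compare(right)


# Made-up sample data standing in for the outputs of the two runs.
legacy = pd.DataFrame({"id": [1, 2, 3], "amount": [10.0, 20.0, 30.0]})
migrated = pd.DataFrame({"id": [1, 2, 3], "amount": [10.0, 25.0, 30.0]})

print(frames_match(legacy, legacy.copy()))  # True
print(frames_match(legacy, migrated))       # False
print(frame_differences(legacy, migrated))
```

In this sketch the equality check gives a quick yes/no answer, while the comparison step narrows a mismatch down to the affected rows and columns, here the amount value in the second row.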
Contributing
To contribute, create a branch with a meaningful name representing the feature you would like to develop (examples: pandas_visualisation_mismatches, pandas_high_perf_dask, spark_notebooks, etc.). Please have a look at the existing branches before creating a new one.
Then, open a pull request against the dev branch, so that no conflicts arise when multiple branches are merged together.
When writing a new function or modifying someone else's, feel free to add your name to the docstring so that people can contact you for help. Example:
    def some_function(df1, df2):
        """
        This function does this and that.

        args:
            df1, df2 -> DataFrames to do stuff on.

        authors:
            Pierre Adeikalam : pierre.adeikalam@axa-direct.com (Creator)
            John Doe : john.doe@axa-direct.com (Contributor)
        """