yProv4DV
A Python utility for automatically packaging the code, inputs and outputs of data visualization scripts.
yProv4DV
yProv4DV (Data Visualization) is a Python utility for packaging the code, inputs and outputs of data visualization scripts. Once integrated, it produces a zip file that includes all the information needed to reproduce the current script, including a copy of the files used. This library is part of the yProv framework, which means it can also produce W3C PROV compliant files useful for interpretability and reproducibility.
Installation
pip install yprov4dv
Current Compatibility
The yProv4DV library can currently track input files opened by the following libraries:
- pandas (read_csv, read_parquet, read_excel, read_json)
- xarray (open_dataset, open_mfdataset)
- geopandas (read_file)
- numpy (load)
- torch (load)
- rasterio (open)
- standard Python calls (such as open())
Additionally, if data is plotted using:
- matplotlib (plot, bar, ...)
- seaborn (scatterplot, lineplot, barplot, histplot, boxplot)
Then the subset of data used only for visualization can be saved in an isolated file (by setting the `save_input_files_subset` option to `True`).
Output files generated during the execution of the program are also logged, regardless of file type.
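To make the idea of tracking files opened through standard Python calls more concrete, here is a minimal sketch of one way such tracking could work: temporarily wrapping the built-in open() and recording each path as an input or an output depending on the mode. This is an illustration only, not yProv4DV's implementation; `track_opened_files` and `demo` are hypothetical names introduced for this example.

```python
import builtins
import os
import tempfile
from contextlib import contextmanager


@contextmanager
def track_opened_files():
    """Hypothetical helper: record paths passed to open(), split by mode."""
    inputs, outputs = [], []
    original_open = builtins.open

    def recording_open(file, mode="r", *args, **kwargs):
        # Only path-like arguments are recorded; integer file descriptors are ignored.
        if isinstance(file, (str, bytes, os.PathLike)):
            path = os.fspath(file)
            # Modes containing w, a, x or + can modify the file: count them as outputs.
            target = outputs if any(c in mode for c in "wax+") else inputs
            target.append(path)
        return original_open(file, mode, *args, **kwargs)

    builtins.open = recording_open
    try:
        yield inputs, outputs
    finally:
        builtins.open = original_open  # always restore the real open()


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "data.csv")
        dst = os.path.join(tmp, "summary.txt")
        with open(src, "w") as f:  # written before tracking starts, so not recorded
            f.write("x,y\n1,2\n")
        with track_opened_files() as (inputs, outputs):
            with open(src) as f:
                rows = f.read().splitlines()
            with open(dst, "w") as f:
                f.write(f"{len(rows) - 1} data rows\n")
        return ([os.path.basename(p) for p in inputs],
                [os.path.basename(p) for p in outputs])


print(demo())
```

Files read through library calls such as pandas or xarray would need their own hooks; the sketch covers only the plain open() case.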
Example
The examples folder contains a simple data visualization script in Python. It is already integrated with the yProv4DV library and can be run with the command:
python ./examples/simple.py
This execution will create:
- The `prov` directory (which is customizable), which holds all the information for the current execution: `inputs`, `outputs` and source code (`src`), all in their respective folders. Additionally, in the same directory, the library creates a set of provenance files containing a description of the current execution (in `.json`, `dot` and `svg` formats).
- `prov.zip`: containing all the aforementioned information in a zipped RO-Crate.
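After a run, it can be handy to check what was actually produced. The sketch below lists the top-level entries of the provenance directory and the member names of the zip, assuming the names `prov` and `prov.zip` described above. `summarize_run` is a hypothetical helper written for this example and is not part of yProv4DV; it uses only the standard library.

```python
import tempfile
import zipfile
from pathlib import Path


def summarize_run(prov_dir="prov", archive="prov.zip"):
    """Return (top-level entries of prov_dir, member names stored in archive)."""
    prov_dir, archive = Path(prov_dir), Path(archive)
    # Missing directory or archive simply yields an empty list.
    entries = sorted(p.name for p in prov_dir.iterdir()) if prov_dir.is_dir() else []
    members = []
    if archive.is_file():
        with zipfile.ZipFile(archive) as zf:
            members = sorted(zf.namelist())
    return entries, members


entries, members = summarize_run()
print("directory entries:", entries)
print("zip members:", members)
```

If the provenance directory was customized, pass its path as the first argument.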
Parameters
To keep the number of yprov4dv calls to a minimum, the library exposes just three directives:
def start_run(*args)
def log_input(path_to_untracked_file)
def log_output(path_to_untracked_file)
The behaviour of yProv4DV can be changed by passing parameters to the start_run function.
All possible fields are listed below:
- `provenance_directory`: (str) changes where the inputs, outputs and code directory are stored;
- `prefix`: (str) changes the prefix given to fields in the provenance document;
- `run_name`: (str) changes the run name inside the provenance file;
- `create_json_file`: (`True` or `False`) whether the json file is created or not;
- `create_dot_file`: (`True` or `False`) whether the dot file is created or not, cannot be `True` if `YPROV4DV_CREATE_JSON_FILE` is `False`;
- `create_svg_file`: (`True` or `False`) whether the svg file is created or not, cannot be `True` if `YPROV4DV_CREATE_JSON_FILE` or `YPROV4DV_CREATE_DOT_FILE` are `False`;
- `create_rocrate`: (`True` or `False`) whether the ro-crate zip is created or not;
- `default_namespace`: (str) changes the default namespace inside the provenance file;
- `save_input_files_full`: (str) decides whether input files are saved in full;
- `save_input_files_subset`: (str) decides whether inputs are saved as a subset (only the plotted data);
- `skip_files_larger_than`: (int) in Mb, files larger than the threshold will not be copied;
- `verbose`: (`True` or `False`)
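The json, dot and svg flags depend on each other, so a set of options can be checked before it is handed to start_run. The sketch below is one possible check, assuming the uppercase names in the list correspond to the `create_json_file` and `create_dot_file` fields. `validate_options` is a hypothetical helper, not part of yProv4DV, and it never calls the library.

```python
def validate_options(options):
    """Hypothetical check: return a list of conflicts among the file-creation flags."""
    problems = []
    # Only explicitly disabled flags count, since the defaults are not listed above.
    json_off = options.get("create_json_file") is False
    dot_off = options.get("create_dot_file") is False

    if options.get("create_dot_file") is True and json_off:
        problems.append("create_dot_file=True requires create_json_file=True")
    if options.get("create_svg_file") is True and (json_off or dot_off):
        problems.append(
            "create_svg_file=True requires create_json_file=True "
            "and create_dot_file=True"
        )
    return problems


options = {
    "run_name": "demo",          # illustrative value
    "create_json_file": False,
    "create_svg_file": True,     # conflicts with create_json_file=False
}
for problem in validate_options(options):
    print(problem)
```

An empty list means no conflict was found among the flags that were set explicitly.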
For an example, run:
python ./examples/customized.py