A Python class that supports Data Science projects.
Project description
resumableds
resumableds helps you write data science scripts with built-in save/resume functionality.
Data can be saved and later resumed, which avoids unnecessary retrieval of raw data from data storage.
The data directory structure is inspired by cookiecutter-data-science (https://drivendata.github.io/cookiecutter-data-science/).
The class also follows the principle 'Analysis is a DAG' (https://drivendata.github.io/cookiecutter-data-science/#analysis-is-a-dag).
resumableds is written in pure Python and is intended for use within Jupyter notebooks. However, it can also be useful in Python scripts or script pipelines.
Example
import pandas as pd
from resumableds import RdsProject # assumed import path, not shown in the original example
proj1 = RdsProject('project1') # create the project object (creates the directory if it doesn't exist yet)
proj1.raw.df1 = pd.DataFrame() # store a dataframe as an attribute of proj1.raw (RdsFs 'raw')
proj1.defs.variable1 = 'foo' # store a simple object as an attribute of proj1.defs (RdsFs 'defs')
proj1.save() # save the attributes of all RdsFs in proj1 to disk
This will result in the following directory structure (plus some overhead of internals):
- <output_dir>/defs/var_variable1.pkl
- <output_dir>/raw/df1.pkl
- <output_dir>/raw/df1.csv
Note that pandas dataframes are always dumped both as pickle (for further processing) and as CSV (for easy exploration). The CSV files are never read back.
Later on, or in another Python session, you can do this:
proj2 = RdsProject('project1') # vars and data are read back to their original names
proj2.defs.variable1 == 'foo' # ==> True
isinstance(proj2.raw.df1, pd.DataFrame) # ==> True
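The save/resume idea shown above can be illustrated with a minimal, standard-library-only sketch. This is not the resumableds implementation: the class name `ResumableNamespace`, its methods, and the one-pickle-file-per-attribute layout are hypothetical and chosen only to show the general pattern of writing attributes to disk and reading them back under their original names.

```python
import pickle
import tempfile
from pathlib import Path


class ResumableNamespace:
    """Hypothetical helper: pickles each public attribute to its own file."""

    def __init__(self, directory):
        # Private names (leading underscore) are never written to disk.
        self._dir = Path(directory)
        self._dir.mkdir(parents=True, exist_ok=True)
        # Load whatever an earlier session left behind.
        self.resume()

    def save(self):
        # Write every public attribute to <directory>/<name>.pkl.
        for name, value in vars(self).items():
            if name.startswith("_"):
                continue
            with open(self._dir / f"{name}.pkl", "wb") as fh:
                pickle.dump(value, fh)

    def resume(self):
        # Restore every pickle file as an attribute named after the file stem.
        for path in sorted(self._dir.glob("*.pkl")):
            with open(path, "rb") as fh:
                setattr(self, path.stem, pickle.load(fh))


with tempfile.TemporaryDirectory() as tmp:
    first = ResumableNamespace(Path(tmp) / "defs")
    first.variable1 = "foo"
    first.save()

    # A second object pointing at the same directory picks the value up again.
    second = ResumableNamespace(Path(tmp) / "defs")
    restored = second.variable1
```

In this sketch the second object plays the role of a later session: it never sets `variable1` itself and only sees it because the first object saved it to the shared directory.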
File details
Details for the file resumableds-1.0.0.tar.gz.
File metadata
- Download URL: resumableds-1.0.0.tar.gz
- Upload date:
- Size: 15.9 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.1 CPython/3.10.9
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 50fa37de478da40a3429e440f5dd3bc7a58bbef456705ede0904035f60621413 |
| MD5 | 6e03e4bc5a5164bae452a221448cdf4c |
| BLAKE2b-256 | 672edff19507eba62e9f48333f49acb5dee536b8e9728e90923e871d2ff71997 |
File details
Details for the file resumableds-1.0.0-py3-none-any.whl.
File metadata
- Download URL: resumableds-1.0.0-py3-none-any.whl
- Upload date:
- Size: 19.0 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.1 CPython/3.10.9
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 2915838a4545cebef2eaf4ab8385e9d9db1b84528387a611caea6dac119ce0ee |
| MD5 | 0a8535107a8e6c6cc357858195ca0fb0 |
| BLAKE2b-256 | b84c0d19e94b6a53e55e700dcc4a36f3eb7523e4f51121f93fc79de9245085f1 |
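The published digests can be used to check a downloaded file before installing it. The following is a minimal sketch using Python's standard `hashlib` module; the helper names `sha256_of_file` and `matches_published_hash` are hypothetical and not part of resumableds or of any packaging tool.

```python
import hashlib


def sha256_of_file(path, chunk_size=65536):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        # iter() with a sentinel keeps reading until read() returns b"".
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path, expected_hex):
    """Compare a file's SHA256 digest with a published hex digest."""
    return sha256_of_file(path) == expected_hex.strip().lower()
```

For example, after downloading resumableds-1.0.0.tar.gz, calling `matches_published_hash` with the local path and the SHA256 value listed for that file should return True if the download is intact.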