Persist expensive operations on disk.
Installation
pip install .
By default, a folder called .persist_to_disk is created under your home directory and is used to store cache files.
If you want to change it, see "Global Settings" below.
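For orientation, the default location described above can be computed as in the sketch below. The helper name default_cache_dir is hypothetical and is not part of persist_to_disk; only the folder name and its placement under the home directory come from the text above.

```python
from pathlib import Path


def default_cache_dir() -> Path:
    # Hypothetical helper (not part of persist_to_disk): the folder name
    # ".persist_to_disk" under the home directory is taken from this README.
    return Path.home() / ".persist_to_disk"


print(default_cache_dir())
```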
Global Settings
To set global settings (for example, where the cache should go by default), please do the following:
import persist_to_disk as ptd
ptd.config.generate_config()
Then you can change the persist_path setting in the generated config.
Example
Using persist_to_disk is very easy.
@ptd.persistf()
def train_a_model(dataset, model_cls, lr, epochs):
    ...
    return trained_model_or_key
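To illustrate the general idea behind a decorator like this, here is a minimal sketch of on-disk memoization. It is not the persist_to_disk implementation: the names disk_cache, CACHE_DIR, and slow_square are hypothetical, and the sketch assumes that all arguments and return values can be pickled.

```python
import functools
import hashlib
import pickle
import tempfile
from pathlib import Path

# Hypothetical cache location used only by this sketch (a fresh temp folder).
CACHE_DIR = Path(tempfile.mkdtemp(prefix="ptd_sketch_"))


def disk_cache(func):
    """Cache the results of func on disk, keyed by a hash of its arguments."""

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        # Build a key from the function name and its (picklable) arguments.
        # Keyword arguments are sorted by name so call order does not matter.
        payload = pickle.dumps((func.__qualname__, args, sorted(kwargs.items())))
        path = CACHE_DIR / (hashlib.sha256(payload).hexdigest() + ".pkl")
        if path.exists():
            # Cache hit: load the stored result instead of recomputing it.
            with path.open("rb") as f:
                return pickle.load(f)
        result = func(*args, **kwargs)
        with path.open("wb") as f:
            pickle.dump(result, f)
        return result

    return wrapper


calls = []


@disk_cache
def slow_square(x):
    calls.append(x)  # record every real execution
    return x * x


first = slow_square(7)   # computed and written to disk
second = slow_square(7)  # read back from disk, function body not run again
```

In this sketch the second call never enters the function body, which is the behavior you want for expensive steps such as model training.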
Note that ptd.persistf can be used with multiprocessing directly.
If the target function (e.g. train_a_model) will not be pickled by such pipelines, you can use persist instead:
@ptd.persist()
def _train_a_model(dataset, model_cls, lr, epochs):
    ...
    return trained_model_or_key

def train_a_model(*args, **kwargs):
    trained_model_or_key = _train_a_model(*args, **kwargs)
    ...  # Do more stuff
    return trained_model_or_key
persist and persistf take the same arguments.
For example, if you want to group the cache folders by dataset (so that you can manage them more easily by hand), and your function takes a dictionary as input (which is not hashable), you could do:
@ptd.persistf(groupby=['dataset'], expand_dict_kwargs=['model_kwargs'])
def train_a_model(dataset, model_cls, model_kwargs, lr, epochs):
    ...
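The sketch below shows one way grouping and dictionary expansion could work conceptually. It is an assumption made for illustration, not the library's actual logic: the function cache_location and its return format are hypothetical, and only the parameter names groupby and expand_dict_kwargs are borrowed from the example above.

```python
import hashlib
import json


def cache_location(kwargs, groupby=(), expand_dict_kwargs=()):
    """Return (group_folder, key) for a call described by keyword arguments."""
    flat = {}
    for name, value in kwargs.items():
        if name in expand_dict_kwargs:
            # Expand the dictionary into one flat entry per item.
            for k, v in value.items():
                flat[f"{name}.{k}"] = v
        else:
            flat[name] = value
    # Arguments listed in groupby name the folder; the rest form the hash key.
    group = "-".join(str(flat[name]) for name in groupby)
    rest = {k: v for k, v in flat.items() if k not in groupby}
    # sort_keys makes the key independent of dictionary insertion order.
    encoded = json.dumps(rest, sort_keys=True, default=str).encode()
    return group, hashlib.sha256(encoded).hexdigest()[:16]


a = cache_location(
    {"dataset": "mnist", "model_kwargs": {"depth": 3, "width": 64}, "lr": 0.1},
    groupby=["dataset"],
    expand_dict_kwargs=["model_kwargs"],
)
b = cache_location(
    {"dataset": "mnist", "model_kwargs": {"width": 64, "depth": 3}, "lr": 0.1},
    groupby=["dataset"],
    expand_dict_kwargs=["model_kwargs"],
)
c = cache_location(
    {"dataset": "mnist", "model_kwargs": {"depth": 3, "width": 64}, "lr": 0.5},
    groupby=["dataset"],
    expand_dict_kwargs=["model_kwargs"],
)
```

Here a and b share the same folder and key even though the dictionary was written in a different order, while c lands in the same folder under a different key because lr changed.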