Execution and caching tool for python
Exca - ⚔
Execute and cache seamlessly in python.
Quick install
pip install exca
Full documentation
Documentation is available at https://facebookresearch.github.io/exca/
Basic overview
exca provides simple decorators to:
- execute a (hierarchy of) computation(s) either locally or on distant nodes,
- cache the result.
The problem:
In ML pipelines, the use of a simple python function, such as my_task:
import numpy as np

def my_task(param: int = 12) -> float:
    return param * np.random.rand()
often requires cumbersome overheads to (1) configure the parameters, (2) submit the job on a cluster, (3) cache the results: e.g.
import pickle
import tempfile
from pathlib import Path
import submitit

# Configure
param = 12
# Assumption: tmp_path was left undefined; any persistent, writable folder works
tmp_path = Path(tempfile.gettempdir())

# Check whether the task has already been executed
filepath = tmp_path / f'result-{param}.npy'
if not filepath.exists():
    # Submit job on cluster
    executor = submitit.AutoExecutor(cluster=None, folder=tmp_path)
    job = executor.submit(my_task, param)
    result = job.result()
    # Cache result
    with filepath.open("wb") as f:
        pickle.dump(result, f)
These overheads cause several problems: they complicate debugging, make hierarchical execution hard to handle, and leave result saving error-prone (ending in the classic 'result-parm12-v2_final_FIX.npy').
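To make the repeated pattern concrete, the check/compute/save steps can be folded into a small reusable decorator. The following is a stdlib-only sketch and not how exca is implemented: `file_cached`, its keyword-only calling convention and the file naming scheme are all hypothetical, and it covers neither job submission nor configuration.

```python
import functools
import pickle
import random
import tempfile
from pathlib import Path


def file_cached(folder: Path):
    """Hypothetical helper: cache a function's result on disk, keyed by its keyword arguments."""

    def decorator(func):
        @functools.wraps(func)
        def wrapper(**kwargs):
            # Build a deterministic file name from the sorted keyword arguments
            key = ",".join(f"{k}={v!r}" for k, v in sorted(kwargs.items()))
            filepath = folder / f"{func.__name__}-{key}.pkl"
            if filepath.exists():
                # Cache hit: reload the stored result instead of recomputing
                with filepath.open("rb") as f:
                    return pickle.load(f)
            # Cache miss: compute, then store the result for later calls
            result = func(**kwargs)
            with filepath.open("wb") as f:
                pickle.dump(result, f)
            return result

        return wrapper

    return decorator


cache_folder = Path(tempfile.mkdtemp())


@file_cached(cache_folder)
def my_task(param: int = 12) -> float:
    return param * random.random()


first = my_task(param=12)
second = my_task(param=12)  # loaded from disk, so the same random draw
```

Even this small helper has to decide how to name files, how to serialize results and how to tell two configurations apart, which is the kind of bookkeeping that tends to get rewritten in every project.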
The solution:
exca can be used to decorate the method of a pydantic model so as to seamlessly configure its execution and caching:
import numpy as np
import pydantic
import exca as xk


class MyTask(pydantic.BaseModel):
    param: int = 12
    infra: xk.TaskInfra = xk.TaskInfra()

    @infra.apply
    def process(self) -> float:
        return self.param * np.random.rand()


task = MyTask(param=1, infra={"folder": tmp_path, "cluster": "auto"})
out = task.process()  # runs on slurm if available
# calling process again will load the cache and not a new random number
assert out == task.process()
See the API reference for all the details.
Quick comparison
| feature \ tool | lru_cache | hydra | submitit | exca |
|---|---|---|---|---|
| RAM cache | ✔ | | | ✔ |
| file cache | | | | ✔ |
| remote compute | | ✔ | ✔ | ✔ |
| pure python (vs command line) | ✔ | | ✔ | ✔ |
| hierarchical config | | ✔ | | ✔ |
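As an illustration of the "RAM cache" row, here is a short sketch using only the standard library's `functools.lru_cache`. It shows that an in-memory cache returns the same result within a process but holds nothing once it is emptied; `cache_clear()` is used here as a stand-in for a process restart, which is an assumption made for the demonstration.

```python
import functools
import random


@functools.lru_cache(maxsize=None)
def my_task(param: int = 12) -> float:
    return param * random.random()


first = my_task(12)
second = my_task(12)  # served from the in-memory cache, no new random draw

# Emptying the cache mimics what a new process starts with: nothing stored
my_task.cache_clear()
info = my_task.cache_info()
```

A file cache avoids this loss because results are stored on disk and can be reloaded by a later process.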
Contributing
See the CONTRIBUTING file for how to help out.
Citing
@misc{exca,
author = {J. Rapin and J.-R. King},
title = {{Exca - Execution and caching}},
year = {2024},
publisher = {GitHub},
journal = {GitHub repository},
howpublished = {\url{https://github.com/facebookresearch/exca}},
}
License
exca is MIT licensed, as found in the LICENSE file.
Also check out Meta Open Source Terms of Use and Privacy Policy.