Project description
Cache ML is a layer on top of joblib that caches parsed datasets, dramatically reducing the load time of large data files. It also supports encryption at rest. The currently supported backends are the local filesystem and S3.
Example Usage
Here is an example from a Jupyter notebook:

    import pandas as pd
    from cacheml.cache import LocalFile, Cache

    cache = Cache()

    @cache.cache  # this function's result will be cached
    def read_and_filter_commits(commits_file_obj):
        # Parse the CSV found at the path held by the file object
        return pd.read_csv(commits_file_obj.path)

    # The file name must be passed as a string
    ts_all = read_and_filter_commits(LocalFile("commits.csv.gz"))

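As a rough illustration of the idea behind the decorator in the example above, here is a minimal standard-library sketch of an on-disk result cache keyed by the input file. It is not CacheML's implementation (which builds on joblib), and the names `disk_cache`, `parse_rows`, and `demo` are hypothetical, introduced only for this sketch.

```python
import functools
import hashlib
import os
import pickle
import tempfile


def disk_cache(cache_dir):
    """Hypothetical decorator: cache fn(path) results on disk, keyed by the input file."""
    os.makedirs(cache_dir, exist_ok=True)

    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(path):
            stat = os.stat(path)
            # Key on function name, absolute path, size, and modification time,
            # so that editing the file invalidates the cached entry.
            key_src = "|".join(
                [fn.__name__, os.path.abspath(path), str(stat.st_size), str(stat.st_mtime_ns)]
            )
            key = hashlib.sha256(key_src.encode("utf-8")).hexdigest()
            cache_path = os.path.join(cache_dir, key + ".pkl")
            if os.path.exists(cache_path):
                # Cache hit: load the stored result instead of re-parsing.
                with open(cache_path, "rb") as f:
                    return pickle.load(f)
            # Cache miss: run the slow parse, then store the result.
            result = fn(path)
            with open(cache_path, "wb") as f:
                pickle.dump(result, f)
            return result

        return wrapper

    return decorator


def demo():
    """Parse a tiny CSV twice and report how often the parser actually ran."""
    work_dir = tempfile.mkdtemp()
    data_path = os.path.join(work_dir, "commits.csv")
    with open(data_path, "w") as f:
        f.write("sha,author\nabc,alice\ndef,bob\n")

    calls = []

    @disk_cache(os.path.join(work_dir, "cache"))
    def parse_rows(path):
        calls.append(path)  # record each real (uncached) parse
        with open(path) as f:
            return [line.rstrip("\n").split(",") for line in f]

    first = parse_rows(data_path)   # parses the file and writes the cache entry
    second = parse_rows(data_path)  # served from the cache entry
    return first, second, len(calls)
```

Calling `demo()` parses the file once and serves the second call from disk. A real caching layer also has to deal with concerns this sketch ignores, such as hashing other function arguments, concurrent writers, and remote storage.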
Performance Test Results
These results come from running the unit tests, which simulate loading the time series data from datahut.ai, stored in a 216 MB compressed CSV file. The first case just loads the file into a dataframe, while the second case does some additional processing (sorting and removing entries outside a time range).
| File location | Time for raw df read | Time for initial read and caching of file | Time for cached read |
|---|---|---|---|
| Local File | 134.0 | 130.9 | 0.41 |
| S3 | 153.6 | 144.6 | 0.38 |

| File location | Time for original function | Time for initial read and caching of file | Time for cached read |
|---|---|---|---|
| Local File | 139.6 | 142.49 | 1.04 |
| S3 | 153.4 | 155.8 | 0.99 |
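
Timings of this kind (an initial call that populates the cache versus a repeat call that reads from it) can be collected with a small harness like the sketch below. This is not the project's test code: `time_call`, `compare_cold_and_warm`, and `slow_square` are hypothetical names, and `functools.lru_cache` is only an in-memory stand-in for a cached loader.

```python
import functools
import time


def time_call(fn, *args, **kwargs):
    """Return (result, elapsed_seconds) for a single call to fn."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start


def compare_cold_and_warm(fn, *args):
    """Time the first (cold) call and an identical repeat (warm) call."""
    cold_result, cold_s = time_call(fn, *args)
    warm_result, warm_s = time_call(fn, *args)
    return {
        "cold_result": cold_result,
        "warm_result": warm_result,
        "cold_s": cold_s,
        "warm_s": warm_s,
    }


@functools.lru_cache(maxsize=None)  # stand-in cache, assumed for illustration only
def slow_square(n):
    time.sleep(0.05)  # simulate an expensive parse
    return n * n
```

For example, `compare_cold_and_warm(slow_square, 12)` returns both results along with the two elapsed times, and the warm time should be far smaller than the cold one. When the results are dataframes, compare them with `DataFrame.equals` rather than `==`.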
Copyright
Copyright 2021 by Benedat LLC. Available under the Apache 2.0 license.
File details
Details for the file CacheML-1.0.4.tar.gz.
File metadata
- Download URL: CacheML-1.0.4.tar.gz
- Upload date:
- Size: 12.5 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/3.6.0 importlib_metadata/4.8.2 pkginfo/1.7.1 requests/2.26.0 requests-toolbelt/0.9.1 tqdm/4.62.3 CPython/3.9.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | fcde1402bd2cb546d85ff5fcf5957dcab53de443c942ca7f1eeb2c3fd5ee74ef |
| MD5 | 8cbaa95fae00d46c2126b593c893d78c |
| BLAKE2b-256 | f91bdb28818d9f3963b57399001c03d237715556911074758e611f847b3182bb |

File details
Details for the file CacheML-1.0.4-py3-none-any.whl.
File metadata
- Download URL: CacheML-1.0.4-py3-none-any.whl
- Upload date:
- Size: 13.2 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/3.6.0 importlib_metadata/4.8.2 pkginfo/1.7.1 requests/2.26.0 requests-toolbelt/0.9.1 tqdm/4.62.3 CPython/3.9.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | d42c16f0dbe88d758c08b24b11abc0164e6070ebb96dca4bd518aea1def765a7 |
| MD5 | 9e5c134727979501d7182a36bbe1ff13 |
| BLAKE2b-256 | abcddd083bca88b8f0fe1b9db628313ce00eeffe0f02d2ae85b2ef9a5022214b |