Cache pandas dataframes with a simple interface
Cache Pandas Dataframes to Disk
Easily cache Pandas Dataframes to disk using a simple interface.
Sample usage
from cache_df import CacheDF
import pandas as pd
cache = CacheDF(cache_dir='./caches')
# Caching a dataframe
df = pd.DataFrame({'a': [1, 2, 3], 'b': [4, 5, 6]})
cache.cache(df, 'my_df')
# Checking if a dataframe is cached
df_is_cached = cache.is_cached('my_df')
# Reading a dataframe from cache
try:
    df = cache.read('my_df')
    df_selective_cols = cache.read('my_df', columns=['a'])  # Read only a subset of columns
except FileNotFoundError:
    print('Dataframe not cached')
# Deleting a dataframe from cache if it exists
cache.uncache('my_df')
# Clearing all cached dataframes
cache.clear()
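The calls above are often combined into a "read if cached, otherwise compute and cache" step. The following is a minimal sketch of that pattern, not part of the package: `read_or_compute` and its `compute` argument are hypothetical names, and the helper assumes only the `is_cached`, `read`, and `cache` methods shown in the sample usage.

```python
from typing import Any, Callable


def read_or_compute(cache: Any, name: str, compute: Callable[[], Any]) -> Any:
    """Return the dataframe cached under `name`, building and caching it on a miss.

    `cache` is any object exposing the `is_cached`, `read`, and `cache`
    methods from the sample usage. `compute` is a hypothetical
    zero-argument callable that builds the dataframe.
    """
    if cache.is_cached(name):
        try:
            return cache.read(name)
        except FileNotFoundError:
            # The entry disappeared between the check and the read
            # (for example, another process removed it); rebuild it below.
            pass
    df = compute()
    cache.cache(df, name)
    return df


# Example call, assuming `cache` is a CacheDF instance and pandas is imported as pd:
# df = read_or_compute(cache, 'my_df', lambda: pd.DataFrame({'a': [1, 2, 3]}))
```

Passing the builder as a callable means the expensive work runs only on a cache miss.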
Where it can be used
- Use it when multiple machines share a file system, such as AWS EFS, GCP Filestore, or Azure Files.
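For the shared file system case, every machine needs to point `cache_dir` at the same mounted path. One way to keep that consistent is to read the path from an environment variable. This is only a sketch under assumptions: the helper `shared_cache_dir` and the variable name `DF_CACHE_DIR` are hypothetical and are not defined by the package.

```python
import os
from pathlib import Path


def shared_cache_dir(env_var: str = "DF_CACHE_DIR", default: str = "./caches") -> str:
    """Resolve the cache directory from an environment variable.

    `DF_CACHE_DIR` is a hypothetical variable name chosen for this sketch.
    On each machine it would be set to the mount point of the shared file system.
    """
    path = Path(os.environ.get(env_var, default))
    # Create the directory if it does not exist yet; a no-op otherwise.
    path.mkdir(parents=True, exist_ok=True)
    return str(path)


# Example call, assuming CacheDF is imported as in the sample usage:
# cache = CacheDF(cache_dir=shared_cache_dir())
```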