Summary

df-diskcache is a Python library for caching pandas.DataFrame objects to local disk.

Installation

pip install df-diskcache

Features

The DataFrameDiskCache class supports the following methods:

  • get: Get a cache entry (pandas.DataFrame) for the key. Returns None if the key is not found.

  • set: Create a cache entry with an optional time-to-live (TTL) for the key-value pair.

  • update

  • touch: Update the last accessed time of a cache entry to extend the TTL.

  • delete: Delete the cache entry for the key.

  • prune: Delete expired cache entries.

  • Dictionary-like operations:
    • __getitem__
    • __setitem__
    • __contains__
    • __delitem__

Usage

Sample Code:
import pandas as pd
from dfdiskcache import DataFrameDiskCache

cache = DataFrameDiskCache()
url = "https://raw.githubusercontent.com/pandas-dev/pandas/v2.1.3/pandas/tests/io/data/csv/iris.csv"

df = cache.get(url)
if df is None:
    print("cache miss")
    df = pd.read_csv(url)
    cache.set(url, df)
else:
    print("cache hit")

print(df)
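
The get-then-set pattern above can be factored into a small helper. The following is a minimal sketch, not part of df-diskcache: the name get_or_load and its parameters are hypothetical, and it relies only on the get and set calls shown in this README (a miss returns None, and set accepts an optional ttl keyword).

```python
from typing import Any, Callable, Optional


def get_or_load(
    cache: Any,
    key: str,
    loader: Callable[[str], Any],
    ttl: Optional[int] = None,
) -> Any:
    """Return the cached value for key; on a miss, load it and cache it.

    Hypothetical helper: `cache` is any object with get(key) and
    set(key, value, ttl=...) methods, such as a DataFrameDiskCache.
    """
    value = cache.get(key)
    # Compare against None explicitly: the truth value of a DataFrame is ambiguous.
    if value is not None:
        return value

    value = loader(key)
    if ttl is None:
        cache.set(key, value)  # let the cache apply its default TTL
    else:
        cache.set(key, value, ttl=ttl)
    return value
```

With the names from the sample above, this would be called as df = get_or_load(cache, url, pd.read_csv, ttl=60).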

The cache also supports dictionary-style access:

Sample Code:
import pandas as pd
from dfdiskcache import DataFrameDiskCache

cache = DataFrameDiskCache()
url = "https://raw.githubusercontent.com/pandas-dev/pandas/v2.1.3/pandas/tests/io/data/csv/iris.csv"

df = cache[url]
if df is None:
    print("cache miss")
    df = pd.read_csv(url)
    cache[url] = df
else:
    print("cache hit")

print(df)

Set TTL for cache entries

Sample Code:
import pandas as pd
from dfdiskcache import DataFrameDiskCache

DataFrameDiskCache.DEFAULT_TTL = 10  # override the class-wide default TTL, in seconds (default: 3600)

cache = DataFrameDiskCache()
url = "https://raw.githubusercontent.com/pandas-dev/pandas/v2.1.3/pandas/tests/io/data/csv/iris.csv"

df = cache.get(url)
if df is None:
    df = pd.read_csv(url)
    cache.set(url, df, ttl=60)  # TTL for this key-value pair

print(df)
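
To make the interaction between ttl, touch, and prune concrete, here is a toy in-memory model of the bookkeeping described under Features. It is only a sketch under stated assumptions, not df-diskcache's implementation: the class ToyTTLCache, its injectable clock, and the rule that an entry expires once more than ttl seconds have passed since it was last set or touched are all assumptions made for illustration. The real library stores data on disk and may differ in detail.

```python
import time
from typing import Any, Callable, Dict, Optional, Tuple


class ToyTTLCache:
    """In-memory model of TTL bookkeeping (hypothetical, for illustration only)."""

    DEFAULT_TTL = 3600  # seconds, mirroring the default quoted in this README

    def __init__(self, clock: Callable[[], float] = time.time) -> None:
        self._clock = clock
        # key -> (value, last_accessed, ttl)
        self._entries: Dict[str, Tuple[Any, float, float]] = {}

    def set(self, key: str, value: Any, ttl: Optional[float] = None) -> None:
        """Store value under key with a per-entry TTL, or the default TTL."""
        ttl = self.DEFAULT_TTL if ttl is None else ttl
        self._entries[key] = (value, self._clock(), ttl)

    def _expired(self, key: str) -> bool:
        _, last_accessed, ttl = self._entries[key]
        return self._clock() - last_accessed > ttl

    def get(self, key: str) -> Optional[Any]:
        """Return the value, or None if the key is missing or expired."""
        if key not in self._entries or self._expired(key):
            return None
        return self._entries[key][0]

    def touch(self, key: str) -> None:
        """Reset the last-accessed time, which extends the entry's lifetime."""
        if key in self._entries:
            value, _, ttl = self._entries[key]
            self._entries[key] = (value, self._clock(), ttl)

    def prune(self) -> int:
        """Delete expired entries and return how many were removed."""
        expired = [key for key in self._entries if self._expired(key)]
        for key in expired:
            del self._entries[key]
        return len(expired)
```

In this model, an entry set with ttl=10 and touched 8 seconds later remains readable until 18 seconds after it was first set, while an untouched entry with the same TTL is reported as a miss after 10 seconds and is removed by the next prune call.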

Dependencies
