A lightweight caching library for Polars

Project description

polars_cache

A lightweight, lazy, disc-based cache for Polars LazyFrames.

Usage

import polars as pl
import polars_cache as pc

lf = pl.LazyFrame({"x": range(100)})

def very_expensive_computation(col: str) -> pl.Expr:
    # return the expression so it can be passed to with_columns
    return pl.col(col).pow(2).exp().sqrt()

query = (
    lf
    .with_columns(very_expensive_computation("x"))
    .pipe(pc.cache_to_disc, max_age=120) # set up cache
)

df1 = query.collect()  # populate the cache
df2 = query.collect()  # second invocation will be much faster!

# do some downstream computation
another_query = query.with_columns(y = pl.col("x") + 7)

df3 = another_query.collect() # this will use the cache!
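
For intuition about what an age-limited disc cache does, here is a minimal standard-library sketch. It is not how polars_cache is implemented: `cached_call`, the pickle file, and the reading of `max_age` as a number of seconds are all assumptions made for illustration.

```python
import os
import pickle
import tempfile
import time

def cached_call(path, compute, max_age):
    """Hypothetical helper: reuse the result stored at `path` while it is
    younger than `max_age` seconds, otherwise recompute and overwrite it."""
    if os.path.exists(path) and time.time() - os.path.getmtime(path) < max_age:
        with open(path, "rb") as f:
            return pickle.load(f)
    result = compute()
    with open(path, "wb") as f:
        pickle.dump(result, f)
    return result

calls = []

def expensive():
    calls.append(1)  # record each real computation
    return [x ** 2 for x in range(5)]

cache_path = os.path.join(tempfile.mkdtemp(), "result.pkl")

first = cached_call(cache_path, expensive, max_age=120)   # computes and writes the file
second = cached_call(cache_path, expensive, max_age=120)  # served from disc

# Backdate the cache file by 500 seconds so it is older than max_age.
old = time.time() - 500
os.utime(cache_path, (old, old))
third = cached_call(cache_path, expensive, max_age=120)   # expired, so recomputed
```

After the three calls, `expensive` has run twice: once to populate the cache and once after the entry aged out.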

Updating a source will cause the cache to refresh:

import os

query_from_a_file = (
    pl.scan_parquet("data.parquet")
    .group_by("age", "sex")
    .agg(pl.len())
    .pipe(pc.cache_to_disc, check_sources=True)
)

_ = query_from_a_file.collect() # populate cache
result = query_from_a_file.collect() # load from cache

os.utime("data.parquet")  # update source timestamp
new_result = query_from_a_file.collect() # cache is invalid -- will refresh
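
One common way to detect a changed source is to compare file modification times. The sketch below illustrates that idea with the standard library only. Whether polars_cache checks sources this way is an assumption, and `cache_is_fresh` is a hypothetical helper, not part of the library.

```python
import os
import tempfile

def cache_is_fresh(cache_path, source_paths):
    """Hypothetical check: the cache is fresh if it exists and no source
    file has been modified after it was written."""
    if not os.path.exists(cache_path):
        return False
    cache_mtime = os.path.getmtime(cache_path)
    return all(os.path.getmtime(src) <= cache_mtime for src in source_paths)

workdir = tempfile.mkdtemp()
source = os.path.join(workdir, "data.parquet")  # placeholder file, not real parquet
cache = os.path.join(workdir, "cache.bin")

for path in (source, cache):
    with open(path, "wb") as f:
        f.write(b"placeholder")

# Set explicit timestamps so the comparison is deterministic.
os.utime(source, (1000, 1000))
os.utime(cache, (2000, 2000))
fresh_before = cache_is_fresh(cache, [source])  # source older than cache

os.utime(source, (3000, 3000))                  # simulate an updated source
fresh_after = cache_is_fresh(cache, [source])   # source newer than cache
```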

⚠️ Warning ⚠️

This function is opaque to the Polars optimizer and splits your query into two chunks: one before the cache statement and one after. Polars optimizes each chunk independently, but optimizations (e.g. projection and predicate pushdown) can NOT cross the cache barrier. Use with caution.
