A lightweight caching library for Polars

polars_cache

A lightweight, lazy, disc-based cache for Polars LazyFrames.

Usage

import polars as pl
import polars_cache as pc

lf = pl.LazyFrame({"x" : range(100)})

def very_expensive(col: str) -> pl.Expr:
    # return the expression; without `return`, with_columns would receive None
    return pl.col(col).pow(2).exp().sqrt()

query = (
    lf
    .with_columns(very_expensive("x"))
    .pipe(pc.cache_to_disc, max_age=120) # set up cache
)

df1 = query.collect()  # populate the cache
df2 = query.collect()  # second invocation will be much faster!

# do some downstream computation
another_query = query.with_columns(y = pl.col("x") + 7)

df3 = another_query.collect() # this will use the cache!

Updating a source will cause the cache to refresh:

import os

query_from_a_file = (
    pl.scan_parquet("data.parquet")
    .group_by("age", "sex")
    .agg(pl.len())
    .pipe(pc.cache_to_disc, check_sources=True)
)

_ = query_from_a_file.collect() # populate cache
result = query_from_a_file.collect() # load from cache

os.utime("data.parquet")  # update source timestamp
new_result = query_from_a_file.collect() # cache is invalid -- will refresh
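The two invalidation rules used above (`max_age` and `check_sources`) can be pictured as a pair of timestamp comparisons. The following is a minimal sketch of one plausible check, not the actual polars_cache implementation. The function name `cache_is_fresh`, its parameters, and the assumption that `max_age` is measured in seconds are all hypothetical.

```python
import time
from pathlib import Path


def cache_is_fresh(cache_file, sources=(), max_age=None, now=None):
    """Hypothetical freshness check; NOT polars_cache's real logic."""
    cache_file = Path(cache_file)
    if not cache_file.exists():
        return False  # nothing has been cached yet

    cached_at = cache_file.stat().st_mtime
    now = time.time() if now is None else now

    # Rule 1 (assumed meaning of max_age): the entry is too old.
    if max_age is not None and now - cached_at > max_age:
        return False

    # Rule 2 (assumed meaning of check_sources): every source file must
    # have been last modified no later than the cache was written.
    return all(Path(s).stat().st_mtime <= cached_at for s in sources)
```

Under this sketch, calling `os.utime("data.parquet")` moves the source's modification time past the cache's, so the next check fails and the query would be recomputed.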

⚠️ Warning ⚠️

This function is opaque to the Polars optimizer and will split your query into two chunks: one before the cache statement and one after. Each chunk will be independently optimized by Polars, but optimizations (e.g. projection and predicate pushdown) will NOT be able to cross the cache barrier. Use with caution.

Download files

Source Distribution

polars_cache-1.0.3.tar.gz (20.5 kB)

Uploaded Source

Built Distribution

polars_cache-1.0.3-py3-none-any.whl (3.6 kB)

Uploaded Python 3

File details

Details for the file polars_cache-1.0.3.tar.gz.

File metadata

  • Download URL: polars_cache-1.0.3.tar.gz
  • Upload date:
  • Size: 20.5 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: uv/0.8.3

File hashes

Hashes for polars_cache-1.0.3.tar.gz
Algorithm Hash digest
SHA256 a0cc048dc832ef9d99c023e0260792a5d97755d9b965f86ee54a7d2170bbe6e9
MD5 7b1017389649b468f24ff6d6eaaacd92
BLAKE2b-256 19b4e8e76e468a0f1482e92d8fa5103bd48da8ead50c3b5c30a39d5355e7e2b2

File details

Details for the file polars_cache-1.0.3-py3-none-any.whl.

File hashes

Hashes for polars_cache-1.0.3-py3-none-any.whl
Algorithm Hash digest
SHA256 547ec084fb4c93c5ba67a59d0b6d4bf97bdf71d2bb14e99c8d4aa0ca36768249
MD5 eb2c14eb2736a89b9d95cb7fdde4ebf4
BLAKE2b-256 26793bebc6a6f8e2689e912c1e0392703e66324c90cfa3442702a8d8ea4897c5
