PySpark Dubber
A compatibility layer for the pyspark API, allowing you to run pyspark code on backends
such as DuckDB and Polars without porting your code.
Why
Lately, SQL engines and DataFrame libraries such as DuckDB and Polars have become popular, offering great performance for non-distributed analytical workloads on fairly large datasets (up to tens of GBs). At these sizes and below, Spark adds a lot of overhead and is slow to start up, which makes it less cost- and time-efficient.
However, Spark is still the most mature and widely used data processing framework, meaning that many people and organizations have large codebases relying on its APIs.
pyspark-dubber is a library that lets you run pyspark code on many backends, such as
DuckDB and Polars (currently, any backend supported by ibis),
so that existing code can be migrated to a new backend with minimal changes.
The aspiration of pyspark-dubber is to be bug-for-bug compatible with pyspark.
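To make the idea of a compatibility layer concrete, here is a minimal conceptual sketch in plain Python. It is not pyspark-dubber's actual API or implementation: the `ListBackend` and `DataFrame` classes below are hypothetical, the backend is just a list of dicts standing in for a real engine, and `filter` takes a Python callable rather than a pyspark column expression. It only shows the general pattern of keeping familiar method names while delegating the work to another engine.

```python
# Conceptual toy only: NOT pyspark-dubber's real API or implementation.
# All class and method names here are hypothetical and chosen for illustration.
from dataclasses import dataclass
from typing import Any, Callable

Row = dict[str, Any]


class ListBackend:
    """Hypothetical stand-in for a real engine such as DuckDB or Polars."""

    def filter_rows(self, rows: list[Row], predicate: Callable[[Row], bool]) -> list[Row]:
        return [row for row in rows if predicate(row)]

    def project(self, rows: list[Row], columns: list[str]) -> list[Row]:
        return [{name: row[name] for name in columns} for row in rows]


@dataclass(frozen=True)
class DataFrame:
    """Exposes PySpark-style method names and forwards the work to a backend."""

    rows: list[Row]
    backend: ListBackend

    def filter(self, predicate: Callable[[Row], bool]) -> "DataFrame":
        # Real pyspark takes a column expression or SQL string; a callable
        # keeps this toy self-contained.
        return DataFrame(self.backend.filter_rows(self.rows, predicate), self.backend)

    def select(self, *columns: str) -> "DataFrame":
        return DataFrame(self.backend.project(self.rows, list(columns)), self.backend)

    def collect(self) -> list[Row]:
        return list(self.rows)


df = DataFrame(
    rows=[{"name": "Ada", "age": 36}, {"name": "Bob", "age": 17}],
    backend=ListBackend(),
)

# The calling code reads like a pyspark method chain, but no Spark is involved.
adults = df.filter(lambda row: row["age"] >= 18).select("name").collect()
print(adults)  # [{'name': 'Ada'}]
```

Because the calling code only depends on the method names, swapping `ListBackend` for another engine would not require changing the method chain, which is the property a compatibility layer aims for.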
Documentation
You can find API documentation and more information about the project on our documentation page.