PySpark-like DataFrame API in Rust (Polars backend), with Python bindings via PyO3

This project has been archived.

The maintainers of this project have marked this project as archived. No new releases are expected.

robin-sparkless (Python)

PySpark-style DataFrames in Python, with no JVM required. Uses Polars under the hood for fast execution.

Install

pip install robin-sparkless

Requirements: Python 3.8+

Quick start

import robin_sparkless as rs

spark = rs.SparkSession.builder().app_name("demo").get_or_create()
df = spark.create_dataframe(
    [(1, 25, "Alice"), (2, 30, "Bob"), (3, 35, "Charlie")],
    ["id", "age", "name"],
)
filtered = df.filter(rs.col("age").gt(rs.lit(26)))
print(filtered.collect())
# [{"id": 2, "age": 30, "name": "Bob"}, {"id": 3, "age": 35, "name": "Charlie"}]

Read from files:

df = spark.read_csv("data.csv")
df = spark.read_parquet("data.parquet")
df = spark.read_json("data.json")
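To try the readers without an existing dataset, the sketch below writes a small data.csv containing the Quick start rows using only the standard library, then loads it with spark.read_csv if robin_sparkless is importable. The temporary directory, the write_sample_csv helper, and the assumption that read_csv treats the first line as a header are illustrative and not taken from the project documentation.

```python
import csv
import os
import tempfile

# Sample rows matching the Quick start data.
COLUMNS = ["id", "age", "name"]
ROWS = [(1, 25, "Alice"), (2, 30, "Bob"), (3, 35, "Charlie")]


def write_sample_csv(path):
    """Write a header row plus the sample rows to `path` (hypothetical helper)."""
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(COLUMNS)
        writer.writerows(ROWS)
    return path


csv_path = write_sample_csv(os.path.join(tempfile.mkdtemp(), "data.csv"))

try:
    import robin_sparkless as rs
except ImportError:
    rs = None  # package not installed: only the sample file is written

if rs is not None:
    spark = rs.SparkSession.builder().app_name("read-demo").get_or_create()
    # Assumption: the first CSV line is interpreted as the header.
    df = spark.read_csv(csv_path)
    print(df.collect())
```

The same pattern should carry over to read_json and read_parquet once a file in the matching format exists.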

Filter, select, group, join, and use window functions with a PySpark-like API. See the full documentation for details.

Optional features (install from source)

Building from source requires Rust and maturin. Clone the repo, then:

pip install maturin
maturin develop --features pyo3           # default: DataFrame API
maturin develop --features "pyo3,sql"      # spark.sql() and temp views
maturin develop --features "pyo3,delta"    # read_delta / write_delta
maturin develop --features "pyo3,sql,delta" # all optional features
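After a source build it can be useful to check which optional methods the installed module actually exposes. The sketch below is one way to do that. It assumes, without confirmation from this README, that methods belonging to a feature that was not compiled in are simply absent from the session object, and that read_delta lives on the session like the other readers. The exposed_methods helper and the OPTIONAL_SESSION_METHODS list are hypothetical.

```python
def exposed_methods(obj, names):
    """Return the subset of `names` that `obj` exposes as callable attributes."""
    return [name for name in names if callable(getattr(obj, name, None))]


# Hypothetical list, derived from the feature comments above (sql, delta).
OPTIONAL_SESSION_METHODS = ["sql", "read_delta"]

try:
    import robin_sparkless as rs
except ImportError:
    rs = None  # package not installed: nothing to probe

if rs is not None:
    spark = rs.SparkSession.builder().app_name("feature-check").get_or_create()
    # Assumption: a missing feature shows up as a missing attribute.
    print(exposed_methods(spark, OPTIONAL_SESSION_METHODS))
```

If a build instead keeps the methods and raises when they are called, this probe would report them as present, so treat the result as a hint rather than a guarantee.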

Type checking

The package ships with PEP 561 type stubs (robin_sparkless.pyi). Use mypy, pyright, or another checker:

pip install robin-sparkless mypy
mypy your_script.py

For Python 3.8 compatibility, use mypy<1.10: newer mypy releases no longer accept python_version = "3.8" in their config. The project's pyproject.toml includes [tool.mypy] and [tool.ruff] sections, with python_version and target-version respectively set for 3.8.
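As a hedged illustration of what your_script.py might look like when it targets Python 3.8: built-in generics such as list[str] require Python 3.9 or later (PEP 585), so the sketch uses typing.List and typing.Dict. The Row alias and the names_older_than helper are hypothetical, and the rows mirror the shape of the collect() output shown in the Quick start rather than any type declared in the package stubs.

```python
from typing import Any, Dict, List

# Hypothetical alias for one collected row (a plain dict, as in the Quick start).
Row = Dict[str, Any]


def names_older_than(rows: List[Row], min_age: int) -> List[str]:
    """Return the names of all rows whose age is strictly above `min_age`."""
    return [str(row["name"]) for row in rows if row["age"] > min_age]


rows: List[Row] = [
    {"id": 2, "age": 30, "name": "Bob"},
    {"id": 3, "age": 35, "name": "Charlie"},
]
print(names_older_than(rows, 32))  # prints ['Charlie']
```

Running mypy your_script.py on a file like this checks the annotations without executing it.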

Development

From a clone of the repo:

# Full CI-like check (Rust + Python lint + Python tests)
make check-full

Or step by step:

python -m venv .venv
source .venv/bin/activate   # or .venv\Scripts\activate on Windows
pip install maturin pytest
maturin develop --features "pyo3,sql,delta"
pytest tests/python/ -v

Python lint and type-check (run by make check-full):

pip install ruff 'mypy>=1.4,<1.10'
ruff format --check .
ruff check .
mypy .

CI uses the same tooling: ruff, mypy<1.10 (Python 3.8), and pytest. The tests do not require PySpark, because the parity expectations are predetermined.
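A test with a predetermined expectation might look roughly like the sketch below: the expected rows are hard-coded (here copied from the Quick start output), so the comparison never needs a PySpark installation. This is not the project's actual test layout. The EXPECTED constant, the filtered_rows helper, and the test name are hypothetical, and the test silently does nothing when robin_sparkless is not installed.

```python
# Predetermined expectation: rows copied from the Quick start output.
EXPECTED = [
    {"id": 2, "age": 30, "name": "Bob"},
    {"id": 3, "age": 35, "name": "Charlie"},
]


def filtered_rows():
    """Run the Quick start filter; return None if robin_sparkless is missing."""
    try:
        import robin_sparkless as rs
    except ImportError:
        return None
    spark = rs.SparkSession.builder().app_name("parity-sketch").get_or_create()
    df = spark.create_dataframe(
        [(1, 25, "Alice"), (2, 30, "Bob"), (3, 35, "Charlie")],
        ["id", "age", "name"],
    )
    return df.filter(rs.col("age").gt(rs.lit(26))).collect()


def test_filter_matches_expected():
    actual = filtered_rows()
    if actual is None:
        return  # nothing to compare without the package
    assert actual == EXPECTED
```

pytest would collect test_filter_matches_expected by its name prefix if the file sits under the tests directory.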

License

MIT

Download files

Source Distributions

No source distribution files available for this release.

Built Distributions

robin_sparkless-0.3.0-cp38-abi3-win_arm64.whl (14.4 MB): CPython 3.8+, Windows ARM64
robin_sparkless-0.3.0-cp38-abi3-win_amd64.whl (16.0 MB): CPython 3.8+, Windows x86-64
robin_sparkless-0.3.0-cp38-abi3-musllinux_1_2_x86_64.whl (15.0 MB): CPython 3.8+, musllinux (musl 1.2+) x86-64
robin_sparkless-0.3.0-cp38-abi3-musllinux_1_2_aarch64.whl (13.7 MB): CPython 3.8+, musllinux (musl 1.2+) ARM64
robin_sparkless-0.3.0-cp38-abi3-manylinux_2_28_aarch64.whl (14.0 MB): CPython 3.8+, manylinux (glibc 2.28+) ARM64
robin_sparkless-0.3.0-cp38-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (15.0 MB): CPython 3.8+, manylinux (glibc 2.17+) x86-64
robin_sparkless-0.3.0-cp38-abi3-macosx_11_0_arm64.whl (15.8 MB): CPython 3.8+, macOS 11.0+ ARM64
robin_sparkless-0.3.0-cp38-abi3-macosx_10_12_x86_64.whl (16.7 MB): CPython 3.8+, macOS 10.12+ x86-64

File details

Details for the file robin_sparkless-0.3.0-cp38-abi3-win_arm64.whl.

File hashes

Hashes for robin_sparkless-0.3.0-cp38-abi3-win_arm64.whl
Algorithm Hash digest
SHA256 d05636274cbb42b5858ae48037a68c7ba51fddbab48a2f27bec6130e05dc88fd
MD5 bebd6f4829c2d83fa7f2fa498ffe3a22
BLAKE2b-256 d44ac772c7dd14f03c7783ae9d284e479cbf76a0ba109b1e482181f90ab0a085

File details

Details for the file robin_sparkless-0.3.0-cp38-abi3-win_amd64.whl.

File hashes

Hashes for robin_sparkless-0.3.0-cp38-abi3-win_amd64.whl
Algorithm Hash digest
SHA256 67ad85158bc0457a8453cc21a504d1b393dec807518e2cbf4631e7dd713d6372
MD5 3f5839b7d0b6d4ec8c989780e76939a3
BLAKE2b-256 79acbba0208a919e01af3ce140abd2c3250c2a226953d619dcf46be9a0df9fef

File details

Details for the file robin_sparkless-0.3.0-cp38-abi3-musllinux_1_2_x86_64.whl.

File hashes

Hashes for robin_sparkless-0.3.0-cp38-abi3-musllinux_1_2_x86_64.whl
Algorithm Hash digest
SHA256 ebb7dcaa33edbf15f6093afe8c7871b389cb94b4544fba76d9d152a32ebdda7c
MD5 e56f0fc3d22440c489873b43e94c7da0
BLAKE2b-256 1caa23949f60f2f7baa37159d0bb914aa88061ad404bfd4c47ab9c144ba5792f

File details

Details for the file robin_sparkless-0.3.0-cp38-abi3-musllinux_1_2_aarch64.whl.

File hashes

Hashes for robin_sparkless-0.3.0-cp38-abi3-musllinux_1_2_aarch64.whl
Algorithm Hash digest
SHA256 4938d99c65d1bb7b99fb4c4e7a449715be0e6ce2af9a93d0554aa1acff99e9ba
MD5 d67707a5b2ec5b5c2af27dbc888c3f8c
BLAKE2b-256 31c8331ef677ed61522b3e0ab0074573633d25a3058e3113f29d2f3257867f51

File details

Details for the file robin_sparkless-0.3.0-cp38-abi3-manylinux_2_28_aarch64.whl.

File hashes

Hashes for robin_sparkless-0.3.0-cp38-abi3-manylinux_2_28_aarch64.whl
Algorithm Hash digest
SHA256 76e44d6739712b5c186e65a58b826ba7632b08bdbd75e16eab92d436f49983b1
MD5 65eb4f1453e2e4da9ab35d783602628d
BLAKE2b-256 b5225a21ed4df7919c6691da3b22472e9f9a99bddd10afb1e4d1789c4bbe0777

File details

Details for the file robin_sparkless-0.3.0-cp38-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File hashes

Hashes for robin_sparkless-0.3.0-cp38-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 cde6ae843ac20671140ea1bb58038525a9f7e2555eddb89221ab3c2f3ba79a46
MD5 5841595cc714f2a7b329dc911518b449
BLAKE2b-256 12d397b8f1c82421ac6eb8cfde844575bf32faf98e756e5222e787c3fefdfe01

File details

Details for the file robin_sparkless-0.3.0-cp38-abi3-macosx_11_0_arm64.whl.

File hashes

Hashes for robin_sparkless-0.3.0-cp38-abi3-macosx_11_0_arm64.whl
Algorithm Hash digest
SHA256 a7f7af9a255eb47118a22f33300d88405938007a78278a4358cd881a54a8c52b
MD5 3fa819cbd1a3befa87507e955fb2a73d
BLAKE2b-256 5cc6a9b8ec0edfac088029e9486cdb5c5a2ebed04ac82c458916ac0617777b74

File details

Details for the file robin_sparkless-0.3.0-cp38-abi3-macosx_10_12_x86_64.whl.

File hashes

Hashes for robin_sparkless-0.3.0-cp38-abi3-macosx_10_12_x86_64.whl
Algorithm Hash digest
SHA256 447edd63b598615f4e816d8eb8baceddd519c2c65b4b9e9f9df97c8ccedb074b
MD5 6295c8a790e055f69ab3e4d7f5ae532b
BLAKE2b-256 d2b566e2043e1b4d1ed664b9869b41f7c6c2fbc03cb6748566617dfbb3e7e108
