PySpark-like DataFrame API in Rust (Polars backend), with Python bindings via PyO3

This project has been archived.

The maintainers of this project have marked this project as archived. No new releases are expected.

Project description

robin-sparkless (Python)

PySpark-style DataFrames in Python, with no JVM required. Uses Polars under the hood for fast execution.

Install

pip install robin-sparkless

Requirements: Python 3.8+

Quick start

import robin_sparkless as rs

spark = rs.SparkSession.builder().app_name("demo").get_or_create()
df = spark.create_dataframe(
    [(1, 25, "Alice"), (2, 30, "Bob"), (3, 35, "Charlie")],
    ["id", "age", "name"],
)
filtered = df.filter(rs.col("age").gt(rs.lit(26)))
print(filtered.collect())
# [{"id": 2, "age": 30, "name": "Bob"}, {"id": 3, "age": 35, "name": "Charlie"}]
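To make the semantics of the filter concrete, the sketch below reproduces the same result in plain Python, with no dependency on robin_sparkless. It only illustrates what `col("age").gt(lit(26))` selects (a strict greater-than comparison, judging by the output shown above); it is not how the library evaluates the expression, and the helper name `filter_greater_than` is made up for this example.

```python
from typing import Any, Dict, List, Sequence, Tuple


def filter_greater_than(
    rows: Sequence[Tuple[Any, ...]],
    columns: Sequence[str],
    column: str,
    threshold: Any,
) -> List[Dict[str, Any]]:
    """Plain-Python stand-in for df.filter(col(column).gt(lit(threshold))).collect()."""
    # Pair each tuple with the column names, giving one dict per row.
    records = [dict(zip(columns, row)) for row in rows]
    # Keep only the rows whose value is strictly greater than the threshold.
    return [record for record in records if record[column] > threshold]


rows = [(1, 25, "Alice"), (2, 30, "Bob"), (3, 35, "Charlie")]
result = filter_greater_than(rows, ["id", "age", "name"], "age", 26)
print(result)
```

The printed list contains the Bob and Charlie records, matching the quick-start output; Alice (age 25) is dropped.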

Read from files:

df = spark.read_csv("data.csv")
df = spark.read_parquet("data.parquet")
df = spark.read_json("data.json")
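The read calls above need an input file to exist. A minimal sketch for trying `read_csv` locally follows: it writes the quick-start rows to a CSV with the standard library, then reads the file back if robin_sparkless is installed. The helper names are hypothetical, and it is an assumption (not stated in this README) that `read_csv` treats the first row as a header.

```python
import csv
from pathlib import Path
from typing import Any, List, Optional


def write_sample_csv(path: str = "data.csv") -> Path:
    """Write the quick-start rows to a CSV file with a header row."""
    rows = [(1, 25, "Alice"), (2, 30, "Bob"), (3, 35, "Charlie")]
    target = Path(path)
    # newline="" lets the csv module control line endings itself.
    with target.open("w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        writer.writerow(["id", "age", "name"])
        writer.writerows(rows)
    return target


def read_with_robin_sparkless(path: Path) -> Optional[List[Any]]:
    """Read the CSV through robin_sparkless, or return None if it is not installed."""
    try:
        import robin_sparkless as rs
    except ImportError:
        return None
    # Only calls that appear in this README are used here.
    spark = rs.SparkSession.builder().app_name("read-demo").get_or_create()
    return spark.read_csv(str(path)).collect()
```

Calling `read_with_robin_sparkless(write_sample_csv())` exercises the whole round trip; the inferred column types depend on the library and are not asserted here.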

Filter, select, group, join, and use window functions with a PySpark-like API. See the full documentation for details.

Optional features (install from source)

Building from source requires Rust and maturin. Clone the repo, then:

pip install maturin
maturin develop --features pyo3           # default: DataFrame API
maturin develop --features "pyo3,sql"      # spark.sql() and temp views
maturin develop --features "pyo3,delta"    # read_delta / write_delta
maturin develop --features "pyo3,sql,delta" # all optional features
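Because the optional features are chosen at build time, a script may want to check what the installed build offers before relying on it. The sketch below is one hedged way to do that. It assumes that feature-gated methods (`sql`, `read_delta`, `write_delta`, named in the comments above) are simply absent when the feature was not compiled in; the README does not confirm this, nor which object each method lives on, so the helper accepts any objects you pass it. `OPTIONAL_METHODS` and `available_features` are hypothetical names.

```python
from typing import Any, Dict, Tuple

# Method names come from the build comments above; the mapping itself is an assumption.
OPTIONAL_METHODS: Dict[str, Tuple[str, ...]] = {
    "sql": ("sql",),
    "delta": ("read_delta", "write_delta"),
}


def available_features(*candidates: Any) -> Dict[str, bool]:
    """Report whether every method of each optional feature is found on some candidate."""

    def has_method(name: str) -> bool:
        # A method counts as present if any candidate exposes a callable of that name.
        return any(callable(getattr(obj, name, None)) for obj in candidates)

    return {
        feature: all(has_method(name) for name in methods)
        for feature, methods in OPTIONAL_METHODS.items()
    }
```

For example, `available_features(spark, df)` would return a dict such as `{"sql": True, "delta": False}` under the stated assumption. If the library instead keeps the methods and raises at call time, this check would report features that are not usable.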

Type checking

The package ships with PEP 561 type stubs (robin_sparkless.pyi). Use mypy, pyright, or another checker:

pip install robin-sparkless mypy
mypy your_script.py

For Python 3.8 compatibility, use mypy<1.10 (newer mypy releases drop support for python_version = "3.8" in the config). The project’s pyproject.toml includes [tool.mypy] and [tool.ruff] sections, with python_version and target-version respectively set for 3.8.
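As a small illustration of a script that type-checks cleanly when targeting Python 3.8, the sketch below annotates the list-of-dicts shape that `collect()` prints in the quick start. It uses `typing.List` and `typing.TypedDict` (both available in 3.8) rather than `list[...]`, which only became subscriptable in Python 3.9. `PersonRecord` and `names_older_than` are hypothetical names, and whether the shipped stubs type `collect()` in a way that matches `PersonRecord` without a cast is not something this README states.

```python
from typing import List, TypedDict


class PersonRecord(TypedDict):
    """Shape of one row as printed by collect() in the quick start."""

    id: int
    age: int
    name: str


def names_older_than(records: List[PersonRecord], min_age: int) -> List[str]:
    """Return the names of all records with age strictly above min_age."""
    return [record["name"] for record in records if record["age"] > min_age]


people: List[PersonRecord] = [
    {"id": 1, "age": 25, "name": "Alice"},
    {"id": 2, "age": 30, "name": "Bob"},
    {"id": 3, "age": 35, "name": "Charlie"},
]
print(names_older_than(people, 26))  # prints ['Bob', 'Charlie']
```

Saving this as your_script.py and running mypy on it should report no errors; changing `min_age` to a string argument at the call site is a quick way to see the checker flag a mistake.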

Development

From a clone of the repo:

# Full CI-like check (Rust + Python lint + Python tests)
make check-full

Or step by step:

python -m venv .venv
source .venv/bin/activate   # or .venv\Scripts\activate on Windows
pip install maturin pytest
maturin develop --features "pyo3,sql,delta"
pytest tests/python/ -v

Python lint and type-check (run by make check-full):

pip install ruff 'mypy>=1.4,<1.10'
ruff format --check .
ruff check .
mypy .

CI uses the same tooling: ruff, mypy<1.10 (Python 3.8), and pytest. PySpark is not required to run the tests, because the parity expectations are determined ahead of time.

License

MIT

Download files

Source Distributions

No source distribution files available for this release.

Built Distributions

robin_sparkless-0.4.0-cp38-abi3-win_arm64.whl (14.4 MB): CPython 3.8+, Windows ARM64
robin_sparkless-0.4.0-cp38-abi3-win_amd64.whl (16.0 MB): CPython 3.8+, Windows x86-64
robin_sparkless-0.4.0-cp38-abi3-musllinux_1_2_x86_64.whl (15.0 MB): CPython 3.8+, musllinux: musl 1.2+ x86-64
robin_sparkless-0.4.0-cp38-abi3-musllinux_1_2_aarch64.whl (13.7 MB): CPython 3.8+, musllinux: musl 1.2+ ARM64
robin_sparkless-0.4.0-cp38-abi3-manylinux_2_28_aarch64.whl (14.0 MB): CPython 3.8+, manylinux: glibc 2.28+ ARM64
robin_sparkless-0.4.0-cp38-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (15.0 MB): CPython 3.8+, manylinux: glibc 2.17+ x86-64
robin_sparkless-0.4.0-cp38-abi3-macosx_11_0_arm64.whl (15.9 MB): CPython 3.8+, macOS 11.0+ ARM64
robin_sparkless-0.4.0-cp38-abi3-macosx_10_12_x86_64.whl (16.7 MB): CPython 3.8+, macOS 10.12+ x86-64
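The wheel file names above follow the standard wheel naming scheme: distribution, version, an optional build tag, Python tag, ABI tag, and platform tag, separated by hyphens, with a dot joining several platform tags in one name. The sketch below splits a file name into those parts using only the standard library. `WheelTags` and `parse_wheel_filename` are hypothetical names, and the parser is deliberately simple (it does not validate the individual tags).

```python
from typing import List, NamedTuple


class WheelTags(NamedTuple):
    distribution: str
    version: str
    python_tag: str
    abi_tag: str
    platform_tags: List[str]


def parse_wheel_filename(filename: str) -> WheelTags:
    """Split a wheel file name into its naming-scheme components."""
    suffix = ".whl"
    if not filename.endswith(suffix):
        raise ValueError(f"not a wheel file name: {filename!r}")
    parts = filename[: -len(suffix)].split("-")
    if len(parts) == 6:
        # An optional build tag sits between the version and the Python tag.
        distribution, version, _build, python_tag, abi_tag, platform = parts
    elif len(parts) == 5:
        distribution, version, python_tag, abi_tag, platform = parts
    else:
        raise ValueError(f"unexpected wheel file name layout: {filename!r}")
    # Several platform tags in one file name are joined with dots.
    return WheelTags(distribution, version, python_tag, abi_tag, platform.split("."))


tags = parse_wheel_filename(
    "robin_sparkless-0.4.0-cp38-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl"
)
print(tags.python_tag, tags.abi_tag, tags.platform_tags)
```

Here `cp38` with `abi3` indicates a build against the CPython 3.8 stable ABI, which is why one wheel per platform is listed as covering CPython 3.8+.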

File details

Details for the file robin_sparkless-0.4.0-cp38-abi3-win_arm64.whl.

File hashes

Hashes for robin_sparkless-0.4.0-cp38-abi3-win_arm64.whl
Algorithm Hash digest
SHA256 e52879eff9838d5cac3c6144d6e88336e86898b816598d84fb6044cdacbb838f
MD5 7e0346f134f0487d561385fcd260ae1d
BLAKE2b-256 6902f6e91062acf1403381b2a9b6bb1fb3b3e9c556e9986c4ead2cb12630022f
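A downloaded wheel can be checked against the SHA256 digest listed above before installing it. The sketch below streams the file through `hashlib` and compares the result to an expected hex digest; the function names and the local file path are assumptions for illustration, while the digest constant is copied from the table above.

```python
import hashlib
import hmac

# SHA256 listed above for robin_sparkless-0.4.0-cp38-abi3-win_arm64.whl.
EXPECTED_SHA256 = "e52879eff9838d5cac3c6144d6e88336e86898b816598d84fb6044cdacbb838f"


def sha256_of(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops once read() returns empty bytes at EOF.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_sha256(path: str, expected_hex: str) -> bool:
    """Compare a file's digest with an expected hex digest, ignoring case and padding."""
    return hmac.compare_digest(sha256_of(path), expected_hex.strip().lower())


# Example call (the path is hypothetical; point it at the wheel you downloaded):
# matches_sha256("robin_sparkless-0.4.0-cp38-abi3-win_arm64.whl", EXPECTED_SHA256)
```

pip can perform the same check itself when requirements are pinned with `--hash=sha256:...` entries and installed in hash-checking mode.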

File details

Details for the file robin_sparkless-0.4.0-cp38-abi3-win_amd64.whl.

File hashes

Hashes for robin_sparkless-0.4.0-cp38-abi3-win_amd64.whl
Algorithm Hash digest
SHA256 c27af0c7a79f66c4eda3d25aad783e0bd5dd8b39d32c49bc0178e758571e3735
MD5 b5a3b595a4be64980d88a3d1d869b968
BLAKE2b-256 3b37499b03f2657873b5ca032402188e26b42b9e9dbc6e1d9f539f248518b416

File details

Details for the file robin_sparkless-0.4.0-cp38-abi3-musllinux_1_2_x86_64.whl.

File hashes

Hashes for robin_sparkless-0.4.0-cp38-abi3-musllinux_1_2_x86_64.whl
Algorithm Hash digest
SHA256 f774c2e53113d9ee6e47d803c72157e2d00e699e80da22889f197da3978c445c
MD5 6b7c16ca187d9db154fa16da665913b6
BLAKE2b-256 a71c8a28020c58c2154b4ba55f88796870312bb793e64b46405bc582638e43bf

File details

Details for the file robin_sparkless-0.4.0-cp38-abi3-musllinux_1_2_aarch64.whl.

File hashes

Hashes for robin_sparkless-0.4.0-cp38-abi3-musllinux_1_2_aarch64.whl
Algorithm Hash digest
SHA256 e6395e23b9e1c7438f99d3602fb8313d690fa370677b7a9c430b640b22da2299
MD5 e51f071149af96789afd04c7b87e2d73
BLAKE2b-256 97284bdaf6a55d31a801e4d0b545a528267ebd41c5ac8a8da04c2acfb54f5c7c

File details

Details for the file robin_sparkless-0.4.0-cp38-abi3-manylinux_2_28_aarch64.whl.

File hashes

Hashes for robin_sparkless-0.4.0-cp38-abi3-manylinux_2_28_aarch64.whl
Algorithm Hash digest
SHA256 e06cd8e135abb4779f0f3f8b42586596b96dce9fc7a5b756aaa625f72cb78bf4
MD5 0e00980391054363f120696fdf2140bf
BLAKE2b-256 74e0103a4c44c20939f17f5feeff676670e2a6b725e92c04cfb33207e74ece85

File details

Details for the file robin_sparkless-0.4.0-cp38-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File hashes

Hashes for robin_sparkless-0.4.0-cp38-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 b615e0166e0c4ad542764a63a4fb64738c36bcc7a9fd2a8c2992ca65b9fa34eb
MD5 9d8c6397fe791af1e8039b7d0c4489ac
BLAKE2b-256 d2124e6ed32979862dcad94c1fb68ed8a1c24d33282557656e1b0d0b7b5e7aed

File details

Details for the file robin_sparkless-0.4.0-cp38-abi3-macosx_11_0_arm64.whl.

File hashes

Hashes for robin_sparkless-0.4.0-cp38-abi3-macosx_11_0_arm64.whl
Algorithm Hash digest
SHA256 8de989480dc0a9924c22bdd35b4afd8d4ab1e27018b3b311921c6fde5650c6f4
MD5 c2b408e057ef2574e56802e33f44407a
BLAKE2b-256 11b9c26f581d98a128e7b93a5144c6f27902262e0b2ae5e3a127e8e6b0814e9c

File details

Details for the file robin_sparkless-0.4.0-cp38-abi3-macosx_10_12_x86_64.whl.

File hashes

Hashes for robin_sparkless-0.4.0-cp38-abi3-macosx_10_12_x86_64.whl
Algorithm Hash digest
SHA256 49aec0c08c69dd268f896e05f2bcbe3a7161012187675ddae2ab10e2362eff70
MD5 8d40a7796e40e651a0746becb1926a9a
BLAKE2b-256 576df6669ea8689cc20ccc3e60173f154e883decbacc1f046dedce71b409272c
