PySpark-like DataFrame API in Rust (Polars backend), with Python bindings via PyO3

This project has been archived by its maintainers. No new releases are expected.

Project description

robin-sparkless (Python)

PySpark-style DataFrames in Python—no JVM. Uses Polars under the hood for fast execution.

Install

pip install robin-sparkless

Requirements: Python 3.8+

Quick start

import robin_sparkless as rs

spark = rs.SparkSession.builder().app_name("demo").get_or_create()
df = spark.create_dataframe(
    [(1, 25, "Alice"), (2, 30, "Bob"), (3, 35, "Charlie")],
    ["id", "age", "name"],
)
filtered = df.filter(rs.col("age").gt(rs.lit(26)))
print(filtered.collect())
# [{"id": 2, "age": 30, "name": "Bob"}, {"id": 3, "age": 35, "name": "Charlie"}]
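To make the expected result of the quick-start filter concrete, here is a minimal plain-Python sketch of the same logic. It does not use robin-sparkless and says nothing about how the library is implemented. The helper names `to_records` and `filter_gt` are hypothetical and exist only for this illustration.

```python
# Plain-Python reference for the quick-start filter (illustration only,
# not robin-sparkless code). Helper names are hypothetical.
rows = [(1, 25, "Alice"), (2, 30, "Bob"), (3, 35, "Charlie")]
columns = ["id", "age", "name"]


def to_records(rows, columns):
    """Turn row tuples into one dict per row, keyed by column name."""
    return [dict(zip(columns, row)) for row in rows]


def filter_gt(records, column, threshold):
    """Keep records whose `column` value is strictly greater than `threshold`."""
    return [record for record in records if record[column] > threshold]


# Same predicate as rs.col("age").gt(rs.lit(26)) in the quick start.
expected = filter_gt(to_records(rows, columns), "age", 26)
print(expected)
```

A reference like this can be handy when writing tests that compare DataFrame output against known rows.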

Read from files:

df = spark.read_csv("data.csv")
df = spark.read_parquet("data.parquet")
df = spark.read_json("data.json")

Filter, select, group, join, and use window functions with a PySpark-like API. See the full documentation for details.

Optional features (install from source)

Building from source requires Rust and maturin. Clone the repo, then:

pip install maturin
maturin develop --features pyo3           # default: DataFrame API
maturin develop --features "pyo3,sql"      # spark.sql() and temp views
maturin develop --features "pyo3,delta"    # read_delta / write_delta
maturin develop --features "pyo3,sql,delta" # all optional features

Type checking

The package ships with PEP 561 type stubs (robin_sparkless.pyi). Use mypy, pyright, or another checker:

pip install robin-sparkless mypy
mypy your_script.py

For Python 3.8 compatibility, use mypy <1.10 (newer mypy drops support for python_version = "3.8" in config). The project’s pyproject.toml includes [tool.mypy] and [tool.ruff] with target-version / python_version set for 3.8.

Development

From a clone of the repo:

# Full CI-like check (Rust + Python lint + Python tests)
make check-full

Or step by step:

python -m venv .venv
source .venv/bin/activate   # or .venv\Scripts\activate on Windows
pip install maturin pytest
maturin develop --features "pyo3,sql,delta"
pytest tests/python/ -v

Python lint and type-check (run by make check-full):

pip install ruff 'mypy>=1.4,<1.10'
ruff format --check .
ruff check .
mypy .

CI uses the same tooling: ruff, mypy<1.10 (Python 3.8), and pytest. PySpark is not required for tests (parity expectations are predetermined).

License

MIT

Download files

Download the file for your platform.

Source Distributions

No source distribution files are available for this release.

Built Distributions

robin_sparkless-0.2.0-cp38-abi3-win_arm64.whl (14.3 MB): CPython 3.8+, Windows ARM64

robin_sparkless-0.2.0-cp38-abi3-win_amd64.whl (15.9 MB): CPython 3.8+, Windows x86-64

robin_sparkless-0.2.0-cp38-abi3-musllinux_1_2_x86_64.whl (14.9 MB): CPython 3.8+, musllinux (musl 1.2+) x86-64

robin_sparkless-0.2.0-cp38-abi3-musllinux_1_2_aarch64.whl (13.6 MB): CPython 3.8+, musllinux (musl 1.2+) ARM64

robin_sparkless-0.2.0-cp38-abi3-manylinux_2_28_aarch64.whl (13.9 MB): CPython 3.8+, manylinux (glibc 2.28+) ARM64

robin_sparkless-0.2.0-cp38-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (14.9 MB): CPython 3.8+, manylinux (glibc 2.17+) x86-64

robin_sparkless-0.2.0-cp38-abi3-macosx_11_0_arm64.whl (15.7 MB): CPython 3.8+, macOS 11.0+ ARM64

robin_sparkless-0.2.0-cp38-abi3-macosx_10_12_x86_64.whl (16.6 MB): CPython 3.8+, macOS 10.12+ x86-64
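The wheel file names listed above follow the standard wheel naming pattern, `{distribution}-{version}-{python tag}-{abi tag}-{platform tag}.whl`. The sketch below splits a name into those parts to show which wheel matches which platform. The names `WheelTags` and `parse_wheel_name` are hypothetical, and the parser is deliberately simplified: it assumes there is no optional build tag, which holds for the files in this release.

```python
from typing import NamedTuple


class WheelTags(NamedTuple):
    """Components of a wheel file name (no optional build tag)."""
    distribution: str
    version: str
    python_tag: str
    abi_tag: str
    platform_tag: str


def parse_wheel_name(filename: str) -> WheelTags:
    """Split a wheel file name into its five dash-separated components."""
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel file name: {filename!r}")
    parts = filename[: -len(".whl")].split("-")
    if len(parts) != 5:
        # Simplification: names with an optional build tag have six parts.
        raise ValueError(f"unexpected number of components in {filename!r}")
    return WheelTags(*parts)


tags = parse_wheel_name(
    "robin_sparkless-0.2.0-cp38-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl"
)
# A platform tag may bundle several compatible platforms, separated by dots.
platforms = tags.platform_tag.split(".")
print(tags.python_tag, tags.abi_tag, platforms)
```

Here `cp38` with `abi3` indicates a wheel built against CPython's stable ABI starting at 3.8, which is why a single file per platform covers CPython 3.8 and later.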

File hashes

Hashes for robin_sparkless-0.2.0-cp38-abi3-win_arm64.whl
Algorithm Hash digest
SHA256 b63a5bcf44373c352459f9d55119de07f27d700801909b3ebed82e8bd4b215bb
MD5 0f9984ad5855394a6fe5f34c40f194df
BLAKE2b-256 2b482ed2f3010184e01a2fb0e62c054a2a878837dd4ff37c0af889992ad8d245

Hashes for robin_sparkless-0.2.0-cp38-abi3-win_amd64.whl
Algorithm Hash digest
SHA256 f7a8852d319be47b72e4003806d3e72016b67eeeef7fd31cb9a270cb67684e8d
MD5 c081653d9629f1c6058e737a08b5dac8
BLAKE2b-256 ef793faaab97ad1ef76605894ee602a530244563e736ce3db3eec7e06e91edce

Hashes for robin_sparkless-0.2.0-cp38-abi3-musllinux_1_2_x86_64.whl
Algorithm Hash digest
SHA256 fb85134b01327f00f652362b6c9148ae1aa2f2b02d5d697e77821bc88d7f79a3
MD5 8deb2ebefa54042089883783ae44e17d
BLAKE2b-256 d171b51bcf490b2fdaef3b13277fc6ce9c880e77b4674662f61056a49b714614

Hashes for robin_sparkless-0.2.0-cp38-abi3-musllinux_1_2_aarch64.whl
Algorithm Hash digest
SHA256 4599511dc8ba3a15b92bcf04989a7784300ea9f385882c798af7782383a6e144
MD5 f0a8f4c6df6f1c79ddcefe0dc26337a9
BLAKE2b-256 613eef214512871d778f6698f93552b2e83ff2b5e0a54a828bfa69a83c38ba3f

Hashes for robin_sparkless-0.2.0-cp38-abi3-manylinux_2_28_aarch64.whl
Algorithm Hash digest
SHA256 f7b6e0645914610710aff65aacb639c6316c7de30dc3d72d348119eba8444bfc
MD5 17569f9ec0fec5377f067f572262dcb0
BLAKE2b-256 828e5d80aae0d10722594f9295d0801ce4b8ab17f8fce826a91e37f7cd51a9da

Hashes for robin_sparkless-0.2.0-cp38-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 23b50944b03d2b3c3f791b9b6ea1d6558de68101f823e626eaa7f5326a7e33ea
MD5 3f7b9db78a16ead48abaa670e632f7a1
BLAKE2b-256 9ee972b75feaeb82d30b994e5594b9d12ecb24f1c817eb39431d0113fd0c9c83

Hashes for robin_sparkless-0.2.0-cp38-abi3-macosx_11_0_arm64.whl
Algorithm Hash digest
SHA256 bf152e417fdf5202d3a57f505c7b89d26054f3f9132dff853355fcdd8d2207fa
MD5 dd020a6456d4a3f5c05b8723100abb69
BLAKE2b-256 167cdf7d23a405e35da3a849bbf2ded9aa4a6343079b0861f90948856f01b1fa

Hashes for robin_sparkless-0.2.0-cp38-abi3-macosx_10_12_x86_64.whl
Algorithm Hash digest
SHA256 a8dbece8cd1245b707064930ed52a57aedb4e457a4a41433c8bff671b7c621f3
MD5 28ca890d4c29251f60932a6c1a5f29c5
BLAKE2b-256 71d5771fcdbc61af4f396edd989d54f8b27336d1e030590993187311fbb6c2f5
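
The SHA256 digests listed above can be used to check a downloaded wheel before installing it. The following is a minimal sketch using only the Python standard library. The function names `sha256_of_file` and `matches_published_hash` are hypothetical, and the path in the usage comment is an assumption about where the file was saved.

```python
import hashlib


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops when read() returns empty bytes (EOF).
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path, expected_hex):
    """Compare a file's SHA256 digest with a published hex digest.

    Surrounding whitespace and letter case in `expected_hex` are ignored.
    """
    return sha256_of_file(path) == expected_hex.strip().lower()


# Usage (assumes the wheel was downloaded to the current directory):
# ok = matches_published_hash(
#     "robin_sparkless-0.2.0-cp38-abi3-win_arm64.whl",
#     "b63a5bcf44373c352459f9d55119de07f27d700801909b3ebed82e8bd4b215bb",
# )
```

Reading in chunks keeps memory use flat, which matters a little for wheels in the 14 to 17 MB range and more for larger artifacts.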
