PySpark-like DataFrame API in Rust (Polars backend), with Python bindings via PyO3
This project has been archived.
The maintainers of this project have marked this project as archived. No new releases are expected.
robin-sparkless (Python)
PySpark-style DataFrames in Python—no JVM. Uses Polars under the hood for fast execution.
Install
pip install robin-sparkless
Requirements: Python 3.8+
Quick start
import robin_sparkless as rs
spark = rs.SparkSession.builder().app_name("demo").get_or_create()
df = spark.create_dataframe(
    [(1, 25, "Alice"), (2, 30, "Bob"), (3, 35, "Charlie")],
    ["id", "age", "name"],
)
filtered = df.filter(rs.col("age").gt(rs.lit(26)))
print(filtered.collect())
# [{"id": 2, "age": 30, "name": "Bob"}, {"id": 3, "age": 35, "name": "Charlie"}]
Read from files:
df = spark.read_csv("data.csv")
df = spark.read_parquet("data.parquet")
df = spark.read_json("data.json")
Filter, select, group, join, and use window functions with a PySpark-like API. See the full documentation for details.
UDFs and pandas_udf (Python)
- Scalar Python UDFs: spark.udf().register("name", f, return_type=...) and call_udf("name", col("x")), or use the returned UserDefinedFunction directly in with_column / select.
- Vectorized Python UDFs: spark.udf().register("name", f, return_type=..., vectorized=True) for column-wise batch UDFs (one output per input row) in with_column / select.
- Grouped vectorized UDFs (GROUPED_AGG): @rs.pandas_udf("double", function_type="grouped_agg") for per-group aggregations in group_by().agg([...]), returning one value per group.
See docs/UDF_GUIDE.md (or the “UDF guide” section in the online docs) for full details, semantics, and limitations.
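To make the three UDF shapes above concrete, here is a library-free sketch in plain Python. It does not call robin_sparkless at all: the helper names (apply_scalar, apply_vectorized, apply_grouped_agg) are hypothetical and only illustrate the input/output contract each kind of UDF is described as having. The actual semantics and limitations are those in the UDF guide.

```python
from collections import defaultdict
from typing import Callable, Dict, Hashable, List, Sequence


def apply_scalar(f: Callable[[float], float], column: Sequence[float]) -> List[float]:
    """Scalar shape: f is called once per row with a single value."""
    return [f(x) for x in column]


def apply_vectorized(
    f: Callable[[List[float]], Sequence[float]], column: Sequence[float]
) -> List[float]:
    """Vectorized shape: f is called once with the whole batch and must
    return exactly one output per input row."""
    out = list(f(list(column)))
    if len(out) != len(column):
        raise ValueError("vectorized UDF must return one output per input row")
    return out


def apply_grouped_agg(
    f: Callable[[List[float]], float],
    keys: Sequence[Hashable],
    column: Sequence[float],
) -> Dict[Hashable, float]:
    """Grouped-aggregate shape: f is called once per group and returns
    a single value for that group."""
    groups: Dict[Hashable, List[float]] = defaultdict(list)
    for key, value in zip(keys, column):
        groups[key].append(value)
    return {key: f(values) for key, values in groups.items()}


# Hypothetical sample data, not taken from the project.
ages = [25.0, 30.0, 35.0]
teams = ["a", "b", "a"]

scalar_out = apply_scalar(lambda x: x + 1, ages)
vectorized_out = apply_vectorized(lambda xs: [x * 2 for x in xs], ages)
grouped_out = apply_grouped_agg(lambda xs: sum(xs) / len(xs), teams, ages)
```

The distinction that matters when choosing between them is how often the Python function is invoked: once per row, once per batch, or once per group.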
Optional features (install from source)
Building from source requires Rust and maturin. Clone the repo, then:
pip install maturin
maturin develop --features pyo3 # default: DataFrame API
maturin develop --features "pyo3,sql" # spark.sql(), temp views, saveAsTable (in-memory tables), catalog.listTables/dropTable, read_delta(name)
maturin develop --features "pyo3,delta" # read_delta / write_delta (path I/O)
maturin develop --features "pyo3,sql,delta" # all optional features
Type checking
The package ships with PEP 561 type stubs (robin_sparkless.pyi). Use mypy, pyright, or another checker:
pip install robin-sparkless mypy
mypy your_script.py
For Python 3.8 compatibility, use mypy <1.10 (newer mypy drops support for python_version = "3.8" in config). The project’s pyproject.toml includes [tool.mypy] and [tool.ruff] with target-version / python_version set for 3.8.
Development
From a clone of the repo:
# Full CI-like check (Rust + Python lint + Python tests)
make check-full
Or step by step:
python -m venv .venv
source .venv/bin/activate # or .venv\Scripts\activate on Windows
pip install maturin pytest
maturin develop --features "pyo3,sql,delta"
pytest tests/python/ -v
Python lint and type-check (run by make check-full):
pip install ruff 'mypy>=1.4,<1.10'
ruff format --check .
ruff check .
mypy .
CI uses the same tooling: ruff, mypy<1.10 (Python 3.8), and pytest. PySpark is not required to run the tests, because the expected results for the parity checks are precomputed.
Links
- Documentation: robin-sparkless.readthedocs.io
- Source: github.com/eddiethedean/robin-sparkless
- Rust crate: crates.io/crates/robin-sparkless
License
MIT
Download files
Source Distributions
Built Distributions
File details
Details for the file robin_sparkless-0.5.0-cp38-abi3-win_arm64.whl.
File metadata
- Download URL: robin_sparkless-0.5.0-cp38-abi3-win_arm64.whl
- Upload date:
- Size: 14.5 MB
- Tags: CPython 3.8+, Windows ARM64
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.13.11
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 8d548efda7a7a0f18815c819c1863a44b3605ad30007b6d4046f6cd46abfdc98 |
| MD5 | 0b8975d182f26003b29bd762c8a2b115 |
| BLAKE2b-256 | 85d6ffc31d0a5a2865d6349816eecbab0c63221027e3707f15f85cf5d90a3b5d |
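The wheel file names in this listing follow the standard wheel naming scheme: distribution, version, an optional build tag, Python tag, ABI tag, and platform tag, separated by hyphens. The sketch below splits a file name into those parts. The WheelName type and parse_wheel_name function are hypothetical helpers written for illustration; they are not part of robin-sparkless or of any packaging library.

```python
from typing import NamedTuple, Optional, Tuple


class WheelName(NamedTuple):
    distribution: str
    version: str
    build: Optional[str]
    python_tag: str
    abi_tag: str
    platform_tags: Tuple[str, ...]


def parse_wheel_name(filename: str) -> WheelName:
    """Split a wheel file name into its hyphen-separated components."""
    suffix = ".whl"
    if not filename.endswith(suffix):
        raise ValueError(f"not a wheel file name: {filename!r}")
    parts = filename[: -len(suffix)].split("-")
    if len(parts) == 5:
        distribution, version, python_tag, abi_tag, platform = parts
        build = None
    elif len(parts) == 6:
        distribution, version, build, python_tag, abi_tag, platform = parts
    else:
        raise ValueError(f"unexpected number of components in {filename!r}")
    # A platform field may hold several tags joined by dots.
    return WheelName(
        distribution, version, build, python_tag, abi_tag, tuple(platform.split("."))
    )


wheel = parse_wheel_name(
    "robin_sparkless-0.5.0-cp38-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl"
)
```

For the files listed here, the cp38 Python tag combined with the abi3 ABI tag indicates a build against CPython's stable ABI, which is why each entry is tagged "CPython 3.8+" rather than a single interpreter version.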
File details
Details for the file robin_sparkless-0.5.0-cp38-abi3-win_amd64.whl.
File metadata
- Download URL: robin_sparkless-0.5.0-cp38-abi3-win_amd64.whl
- Upload date:
- Size: 16.1 MB
- Tags: CPython 3.8+, Windows x86-64
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.12.10
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 7830336434e320e348ffff82be78709c17dcf00351c4311eca91ae3e5b8a93e5 |
| MD5 | 7112a8f607ffa469673266a136852fa5 |
| BLAKE2b-256 | 31489e9c0221a77bd77c2d99ff8efd9d12b8d5424b98cb51a9d38634ad6c531f |
File details
Details for the file robin_sparkless-0.5.0-cp38-abi3-musllinux_1_2_x86_64.whl.
File metadata
- Download URL: robin_sparkless-0.5.0-cp38-abi3-musllinux_1_2_x86_64.whl
- Upload date:
- Size: 15.1 MB
- Tags: CPython 3.8+, musllinux: musl 1.2+ x86-64
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.12.3
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | d7481e8f53bfd8ad76e9a988267171621c6e999e2f84542b5f8bfb497195d168 |
| MD5 | f2de654739c93aaa26875c4420bcbd06 |
| BLAKE2b-256 | 5645fc07761abcf4cf3470ae663041369030548c819153b2d792b91a7e1e2bbc |
File details
Details for the file robin_sparkless-0.5.0-cp38-abi3-musllinux_1_2_aarch64.whl.
File metadata
- Download URL: robin_sparkless-0.5.0-cp38-abi3-musllinux_1_2_aarch64.whl
- Upload date:
- Size: 13.8 MB
- Tags: CPython 3.8+, musllinux: musl 1.2+ ARM64
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.12.3
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 9820d4c6bce93d7636570044239eb10860f7d38b20a859866985ac69a5d1a343 |
| MD5 | c36836ea6ed73813fe74188c5e7130b5 |
| BLAKE2b-256 | b55ab1e01a4a7eeed5de8d1a99b808b0a82d358854c3d20e207b9efc0366392a |
File details
Details for the file robin_sparkless-0.5.0-cp38-abi3-manylinux_2_28_aarch64.whl.
File metadata
- Download URL: robin_sparkless-0.5.0-cp38-abi3-manylinux_2_28_aarch64.whl
- Upload date:
- Size: 14.1 MB
- Tags: CPython 3.8+, manylinux: glibc 2.28+ ARM64
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.12.3
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 2a8d7040f655ee1ed0b0b383d46bc91ae2a935d4542eef7cd02c6777fdc17b0f |
| MD5 | e97984424b0b413d9286bcd7e04c6c65 |
| BLAKE2b-256 | 9f1bb0760ab6d41dc2bd62d848cf1314042faa4291b6749d3b2cd805a0cf56e6 |
File details
Details for the file robin_sparkless-0.5.0-cp38-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.
File metadata
- Download URL: robin_sparkless-0.5.0-cp38-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
- Upload date:
- Size: 15.1 MB
- Tags: CPython 3.8+, manylinux: glibc 2.17+ x86-64
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.12.3
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | c41e865d81684547df1985ea2fce234a015611371c47cc67c0ce18b69d395262 |
| MD5 | 3d413fa0a53a03bb46e9805278802557 |
| BLAKE2b-256 | 952731a84a09a06c8d29e1e2036d95866f29c6554a5c979cb8a76c1329eb55b4 |
File details
Details for the file robin_sparkless-0.5.0-cp38-abi3-macosx_11_0_arm64.whl.
File metadata
- Download URL: robin_sparkless-0.5.0-cp38-abi3-macosx_11_0_arm64.whl
- Upload date:
- Size: 16.0 MB
- Tags: CPython 3.8+, macOS 11.0+ ARM64
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.14.2
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | a48ba1664515875e6b56de9160430a9b2ff23e54e887bdecb2ff8c32ca237a71 |
| MD5 | 782bd4f75a59cf17e466b6f65af92f4f |
| BLAKE2b-256 | b27ad418b3060c0d1be539efe4859ca27508f7ac06eab400b71d2823a449e475 |
File details
Details for the file robin_sparkless-0.5.0-cp38-abi3-macosx_10_12_x86_64.whl.
File metadata
- Download URL: robin_sparkless-0.5.0-cp38-abi3-macosx_10_12_x86_64.whl
- Upload date:
- Size: 16.8 MB
- Tags: CPython 3.8+, macOS 10.12+ x86-64
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.14.2
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 8fe504cde225ed8424d344caf9ec291d30309d97bf674dbe54fe63eb66939f99 |
| MD5 | 056231a7f720fce16fd006b03f8416ee |
| BLAKE2b-256 | 69803526097c5fc78dfa903afaa7120f012c3cbd742ac615665ceaf1921c70ac |
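The SHA256 digests listed for each file can be checked locally after a download. The following is a minimal sketch using only the standard library. The function names sha256_of_file and matches_published_digest are hypothetical, and the path you pass in is assumed to be wherever you saved the wheel.

```python
import hashlib
import hmac


def sha256_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA256 digest of a file, reading it in chunks so
    that a multi-megabyte wheel is never loaded into memory at once."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_digest(path: str, expected_hex: str) -> bool:
    """Compare a file's SHA256 digest with a published hex digest,
    ignoring surrounding whitespace and letter case."""
    expected = expected_hex.strip().lower()
    return hmac.compare_digest(sha256_of_file(path), expected)
```

A typical call passes the downloaded file's path together with the SHA256 value from the matching table; a False result means the file should not be installed.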