PySpark-like DataFrame API in Rust (Polars backend), with Python bindings via PyO3
This project has been archived.
The maintainers of this project have marked this project as archived. No new releases are expected.
Robin Sparkless
PySpark-style DataFrames in Rust—no JVM. A DataFrame library that mirrors PySpark’s API and semantics while using Polars as the execution engine.
Why Robin Sparkless?
- Familiar API: SparkSession, DataFrame, Column, and PySpark-like functions, so you can reuse patterns without the JVM.
- Polars under the hood: Fast, native Rust execution with Polars for IO, expressions, and aggregations.
- Rust-first, Python optional: Use it as a Rust library or build the Python extension via PyO3 for a drop-in style API.
- Sparkless backend target: Designed to power Sparkless (the Python PySpark replacement) so Sparkless can run on this engine via PyO3.
Features
| Area | What’s included |
|---|---|
| Core | SparkSession, DataFrame, Column; filter, select, with_column, order_by, group_by, joins |
| IO | CSV, Parquet, JSON via SparkSession::read_* |
| Expressions | col(), lit(), when/then/otherwise, coalesce, cast, type/conditional helpers |
| Aggregates | count, sum, avg, min, max, and more; multi-column groupBy |
| Window | row_number, rank, dense_rank, lag, lead, first_value, last_value, and others with .over() |
| Arrays & maps | array_*, explode, create_map, map_keys, map_values, and related functions |
| Strings & JSON | String functions (upper, lower, substring, regexp_*, etc.), get_json_object, from_json, to_json |
| Datetime & math | Date/time extractors and arithmetic, year/month/day, math (sin, cos, sqrt, pow, …) |
| Optional SQL | spark.sql("SELECT ...") with temp views (createOrReplaceTempView, table) — enable with --features sql |
| Optional Delta | read_delta, read_delta_with_version, write_delta — enable with --features delta |
Known differences from PySpark are documented in docs/PYSPARK_DIFFERENCES.md. Parity status and roadmap are in docs/PARITY_STATUS.md and docs/ROADMAP.md.
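The Window row above lists row_number, rank, and dense_rank side by side, and the three are easy to confuse. The sketch below models their PySpark semantics in plain Python for a single partition ordered ascending. It does not import robin_sparkless, and the helper name ranking_columns is hypothetical. It assumes the library matches PySpark on tie handling; docs/PYSPARK_DIFFERENCES.md is the place to confirm that.

```python
def ranking_columns(values):
    """Return (value, row_number, rank, dense_rank) tuples for one partition.

    Plain-Python model of PySpark ranking semantics, ordered ascending:
      row_number: 1, 2, 3, ... with no ties
      rank:       tied values share a rank, and the next rank skips ahead
      dense_rank: tied values share a rank, and the next rank does not skip
    """
    out = []
    rank = 0
    dense_rank = 0
    previous = object()  # sentinel that compares unequal to every real value
    for row_number, value in enumerate(sorted(values), start=1):
        if value != previous:
            rank = row_number   # jump to the current position (leaves gaps)
            dense_rank += 1     # advance by exactly one (no gaps)
            previous = value
        out.append((value, row_number, rank, dense_rank))
    return out


print(ranking_columns([10, 20, 20, 30]))
# [(10, 1, 1, 1), (20, 2, 2, 2), (20, 3, 2, 2), (30, 4, 4, 3)]
```

The two tied rows share rank 2; rank then jumps to 4 while dense_rank continues at 3.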
Installation
Rust
Add to your Cargo.toml:
[dependencies]
robin-sparkless = "0.1.0"
Optional features:
robin-sparkless = { version = "0.1.0", features = ["sql"] } # spark.sql(), temp views
robin-sparkless = { version = "0.1.0", features = ["delta"] } # Delta Lake read/write
Python (PyO3)
Build the Python extension with maturin (Rust + Python 3.8+):
pip install maturin
maturin develop --features pyo3
# With optional SQL and/or Delta:
maturin develop --features "pyo3,sql"
maturin develop --features "pyo3,delta"
maturin develop --features "pyo3,sql,delta"
Then use the robin_sparkless module; see docs/PYTHON_API.md.
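After maturin develop finishes, it can help to confirm that the extension landed in the interpreter you are actually using. The sketch below uses only the standard library, so it runs whether or not the build succeeded. The helper name extension_available is hypothetical and not part of the project.

```python
import importlib.util


def extension_available(module_name: str = "robin_sparkless") -> bool:
    """Return True if `module_name` can be located on the current import path."""
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    if extension_available():
        print("robin_sparkless is importable")
    else:
        # maturin develop installs into the active virtualenv, so a miss here
        # usually means a different interpreter or environment is in use.
        print("robin_sparkless not found; rerun `maturin develop --features pyo3`")
```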
Quick start
Rust
use robin_sparkless::{col, lit_i64, SparkSession};

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let spark = SparkSession::builder().app_name("demo").get_or_create();

    // Create a DataFrame from rows (id, age, name)
    let df = spark.create_dataframe(
        vec![
            (1, 25, "Alice".to_string()),
            (2, 30, "Bob".to_string()),
            (3, 35, "Charlie".to_string()),
        ],
        vec!["id", "age", "name"],
    )?;

    // Filter and show
    let adults = df.filter(col("age").gt(lit_i64(26)))?;
    adults.show(Some(10))?;

    Ok(())
}
You can also wrap an existing Polars DataFrame with DataFrame::from_polars(polars_df). See docs/QUICKSTART.md for joins, window functions, and more.
Python
import robin_sparkless as rs
spark = rs.SparkSession.builder().app_name("demo").get_or_create()
df = spark.create_dataframe([(1, 25, "Alice"), (2, 30, "Bob")], ["id", "age", "name"])
filtered = df.filter(rs.col("age").gt(rs.lit(26)))
print(filtered.collect()) # [{"id": 2, "age": 30, "name": "Bob"}]
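To make the expected result of the Python quick start concrete without installing the extension, here is a plain-Python model of the same filter. It does not call robin_sparkless; filter_gt is a hypothetical helper that only mirrors what col("age").gt(lit(26)) followed by collect() is shown to return above. Dropping rows whose value is None follows PySpark's null-comparison behaviour and is an assumption about this library.

```python
rows = [(1, 25, "Alice"), (2, 30, "Bob")]
columns = ["id", "age", "name"]


def filter_gt(rows, columns, column, threshold):
    """Keep rows where `column` > `threshold`, returned as a list of dicts."""
    idx = columns.index(column)
    return [
        dict(zip(columns, row))
        for row in rows
        # A None value never satisfies the comparison, so the row is dropped.
        if row[idx] is not None and row[idx] > threshold
    ]


result = filter_gt(rows, columns, "age", 26)
print(result)  # [{'id': 2, 'age': 30, 'name': 'Bob'}]
```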
Development
Prerequisites: Rust (see rust-toolchain.toml). The Python tests also need Python 3.8+, maturin, and pytest.
| Command | Description |
|---|---|
| cargo build | Build (Rust only) |
| cargo build --features pyo3 | Build with Python extension |
| cargo test | Run Rust tests |
| make test | Run Rust + Python tests (creates venv, maturin develop, pytest) |
| make check | Format, clippy, audit, deny, tests |
| cargo bench | Benchmarks (robin-sparkless vs Polars) |
| cargo doc --open | Build and open API docs |
CI runs the same checks on push/PR (see .github/workflows/ci.yml).
Documentation
- Full documentation (Read the Docs) — Quickstart, Python API, reference, and Sparkless integration (MkDocs)
- API reference (docs.rs) — Crate API
- QUICKSTART — Build, usage, optional features, benchmarks
- ROADMAP — Development roadmap and Sparkless integration
- PYSPARK_DIFFERENCES — Known divergences from PySpark
- RELEASING — Releasing and publishing to crates.io
See also CHANGELOG.md for version history.
License
MIT
Download files
File details
Details for the file robin_sparkless-0.1.0.tar.gz.
File metadata
- Download URL: robin_sparkless-0.1.0.tar.gz
- Upload date:
- Size: 365.6 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: maturin/1.11.5
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | cf93903c9eb8b9a1a884c6e9c3a8aa7594283ea819b4304c6ca5924f9e7a9419 |
| MD5 | 4381da59d7364faf8b802425cef549d4 |
| BLAKE2b-256 | 315aa9da912366aec7e92a52ec32fb7343ea669a696e9e5fc7e5920a8f1aee8f |
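The digests listed for each file can be checked against a downloaded copy. A minimal sketch using only the standard library follows; sha256_of and matches_published are hypothetical helper names, and the file is read in chunks so that multi-megabyte wheels do not have to fit in memory at once.

```python
import hashlib


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA256 digest of the file at `path`, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published(path, expected_hex):
    """Compare a file's SHA256 with a published digest, ignoring case and whitespace."""
    return sha256_of(path) == expected_hex.strip().lower()
```

For example, calling matches_published("robin_sparkless-0.1.0.tar.gz", digest) with the SHA256 value listed for the source distribution should return True for an intact download.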
File details
Details for the file robin_sparkless-0.1.0-cp38-abi3-win_arm64.whl.
File metadata
- Download URL: robin_sparkless-0.1.0-cp38-abi3-win_arm64.whl
- Upload date:
- Size: 14.2 MB
- Tags: CPython 3.8+, Windows ARM64
- Uploaded using Trusted Publishing? No
- Uploaded via: maturin/1.11.5
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 036031245f545df746fe0c67e5daa15550e1a24b01251fc152e0e502264a3df2 |
| MD5 | 18435a0bd9a9036905dbdabdc162f596 |
| BLAKE2b-256 | 07e01a7fe75ab18d826333e9d345cc43e1b52ac9177819e141f8a271a57edc24 |
File details
Details for the file robin_sparkless-0.1.0-cp38-abi3-win_amd64.whl.
File metadata
- Download URL: robin_sparkless-0.1.0-cp38-abi3-win_amd64.whl
- Upload date:
- Size: 15.8 MB
- Tags: CPython 3.8+, Windows x86-64
- Uploaded using Trusted Publishing? No
- Uploaded via: maturin/1.11.5
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 1f7b5a0f05ef0eeb15942fc5f37129f77f5ba9fb9cc97a6291a22f023f67a90d |
| MD5 | bfdb240f8610030cfec8fcbdc7a9da26 |
| BLAKE2b-256 | 7efd8d15c1d82ce15a24c470133098d917a088fb5b7bd5b1dbe6be49bb8ac2ae |
File details
Details for the file robin_sparkless-0.1.0-cp38-abi3-musllinux_1_2_x86_64.whl.
File metadata
- Download URL: robin_sparkless-0.1.0-cp38-abi3-musllinux_1_2_x86_64.whl
- Upload date:
- Size: 18.4 MB
- Tags: CPython 3.8+, musllinux: musl 1.2+ x86-64
- Uploaded using Trusted Publishing? No
- Uploaded via: maturin/1.11.5
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 8bd7af41d94893cbe3352f24afbd1bd120dc60cabc82c8b240ff49a96f591c09 |
| MD5 | af6e52c2e1492f990ca0e85430fd2e28 |
| BLAKE2b-256 | c22cee81941e19099ea7b272870afd506a5e13f74c7180f4681010b3b1bb257e |
File details
Details for the file robin_sparkless-0.1.0-cp38-abi3-musllinux_1_2_aarch64.whl.
File metadata
- Download URL: robin_sparkless-0.1.0-cp38-abi3-musllinux_1_2_aarch64.whl
- Upload date:
- Size: 16.9 MB
- Tags: CPython 3.8+, musllinux: musl 1.2+ ARM64
- Uploaded using Trusted Publishing? No
- Uploaded via: maturin/1.11.5
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 94bc2ecc718c61bef99b89222632b4f02d5ee8201e714d96a87f04d8a1389f09 |
| MD5 | 536de72e9afeda0abe257e6d0dcb2893 |
| BLAKE2b-256 | 212bae282636259ec99b11fdc2fbdc04cf3a371ab76d87f150015ae0a549e554 |
File details
Details for the file robin_sparkless-0.1.0-cp38-abi3-manylinux_2_28_aarch64.whl.
File metadata
- Download URL: robin_sparkless-0.1.0-cp38-abi3-manylinux_2_28_aarch64.whl
- Upload date:
- Size: 13.8 MB
- Tags: CPython 3.8+, manylinux: glibc 2.28+ ARM64
- Uploaded using Trusted Publishing? No
- Uploaded via: maturin/1.11.5
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 3eeca6ce74692e95d1916ee9e6411b82088d1444c26b380ea0634f00a07f232e |
| MD5 | 74bc79518f27c686e87ff917255263a2 |
| BLAKE2b-256 | 28129d2469239298932509e4bd5b764b2e784679583c6e2fcbefa6bbfa24f0f8 |
File details
Details for the file robin_sparkless-0.1.0-cp38-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.
File metadata
- Download URL: robin_sparkless-0.1.0-cp38-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
- Upload date:
- Size: 14.8 MB
- Tags: CPython 3.8+, manylinux: glibc 2.17+ x86-64
- Uploaded using Trusted Publishing? No
- Uploaded via: maturin/1.11.5
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 559db2f5d16b81c4fd9660b94cac57b0e369dedcd44f77f771fc038be5f4b0dd |
| MD5 | c13f4875445bd60f7e787f2756b7bc52 |
| BLAKE2b-256 | c2db10bd37f1c0a305108ce14685f5eb03a95bd47bc170d8c58d96b5801d9ed3 |
File details
Details for the file robin_sparkless-0.1.0-cp38-abi3-macosx_11_0_arm64.whl.
File metadata
- Download URL: robin_sparkless-0.1.0-cp38-abi3-macosx_11_0_arm64.whl
- Upload date:
- Size: 13.3 MB
- Tags: CPython 3.8+, macOS 11.0+ ARM64
- Uploaded using Trusted Publishing? No
- Uploaded via: maturin/1.11.5
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 865ed7418c2b61e052db1ad329d37c45642684bed6c238a61b77794838456f73 |
| MD5 | 184701aa493a69bfb4a0eb9e1fe49a4b |
| BLAKE2b-256 | d6ca11fcb75abccc90e0efa0cbd4424526b9a0d0bafa14ebd0ffa5609381dda2 |
File details
Details for the file robin_sparkless-0.1.0-cp38-abi3-macosx_10_12_x86_64.whl.
File metadata
- Download URL: robin_sparkless-0.1.0-cp38-abi3-macosx_10_12_x86_64.whl
- Upload date:
- Size: 14.3 MB
- Tags: CPython 3.8+, macOS 10.12+ x86-64
- Uploaded using Trusted Publishing? No
- Uploaded via: maturin/1.11.5
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 132f680a2ea258c70eb75e9622a1fe4094ace8a21a21373e8712bdc5b9f475cc |
| MD5 | 2c19fc1d60e2f9bba9bd59ac73b91c45 |
| BLAKE2b-256 | 3df0d301bf16ca7358b788f0b71ec397ab0dfe2f6eb77686508f9ff06761f933 |
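The wheel file names above follow the standard layout distribution-version-python tag-ABI tag-platform tag, with an optional build tag after the version. The sketch below splits a name into those parts using only the standard library. WheelName and parse_wheel_name are hypothetical names, and the parser assumes a well-formed name rather than validating each tag.

```python
from typing import NamedTuple, Optional, Tuple


class WheelName(NamedTuple):
    distribution: str
    version: str
    build: Optional[str]
    python_tag: str
    abi_tag: str
    platforms: Tuple[str, ...]


def parse_wheel_name(filename: str) -> WheelName:
    """Split a wheel file name into its dash-separated components."""
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel file name: {filename!r}")
    parts = filename[: -len(".whl")].split("-")
    if len(parts) == 5:
        distribution, version, python_tag, abi_tag, platform = parts
        build = None
    elif len(parts) == 6:
        distribution, version, build, python_tag, abi_tag, platform = parts
    else:
        raise ValueError(f"unexpected number of components in {filename!r}")
    # A platform tag may bundle several dot-separated platforms.
    return WheelName(distribution, version, build, python_tag, abi_tag,
                     tuple(platform.split(".")))


info = parse_wheel_name(
    "robin_sparkless-0.1.0-cp38-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl"
)
print(info.python_tag, info.abi_tag, info.platforms)
# cp38 abi3 ('manylinux_2_17_x86_64', 'manylinux2014_x86_64')
```

The cp38-abi3 pair is why each wheel is tagged "CPython 3.8+": it targets the stable ABI, so one build serves CPython 3.8 and later.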