Pure Rust SPSS .sav/.zsav reader with Polars DataFrame output

ambers

Pure Rust SPSS .sav/.zsav reader — Arrow-native, zero C dependencies.

Features

  • Read .sav (bytecode) and .zsav (zlib) files
  • Arrow RecordBatch output — zero-copy to Polars, DataFusion, DuckDB
  • Rich metadata: variable labels, value labels, missing values, MR sets, measure levels
  • Lazy reader via scan_sav() — returns Polars LazyFrame with projection and row limit pushdown
  • No PyArrow dependency — uses Arrow PyCapsule Interface for zero-copy transfer
  • The fastest SPSS reader in the benchmarks below: up to 3.6x faster than polars_readstat and up to 16x faster than pyreadstat
  • Python + Rust dual API from a single crate

Installation

Python:

pip install ambers

Rust:

cargo add ambers

Quick Start

Python

import ambers as am

# Eager read — data + metadata
df, meta = am.read_sav("survey.sav")

# Lazy read — returns Polars LazyFrame
lf, meta = am.scan_sav("survey.sav")
df = lf.select(["Q1", "Q2", "age"]).head(1000).collect()

# Explore metadata
meta.summary()
meta.describe("Q1")
meta.value("Q1")

# Read metadata only (fast, skips data)
meta = am.read_sav_metadata("survey.sav")

Rust

use ambers::{read_sav, read_sav_metadata};

// Read data + metadata
let (batch, meta) = read_sav("survey.sav")?;
println!("{} rows, {} cols", batch.num_rows(), meta.number_columns);

// Read metadata only
let meta = read_sav_metadata("survey.sav")?;
println!("{}", meta.label("Q1").unwrap_or("(no label)"));

Metadata API (Python)

| Method | Description |
|---|---|
| meta.summary() | Formatted overview: file info, type distribution, annotations |
| meta.describe("Q1") | Deep-dive into a single variable (or list of variables) |
| meta.diff(other) | Compare two metadata objects, returns MetaDiff |
| meta.label("Q1") | Variable label |
| meta.value("Q1") | Value labels dict |
| meta.format("Q1") | SPSS format string (e.g. "F8.2", "A50") |
| meta.measure("Q1") | Measurement level ("nominal", "ordinal", "scale") |
| meta.schema | Full metadata as a nested Python dict |

All variable-name methods raise KeyError for unknown variables.
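
Because the variable-name methods raise KeyError for unknown variables, code that probes optional variables may want a small wrapper. The following is a minimal sketch that relies only on the meta.label(name) behaviour described above. The helper name label_or_default and the FakeMeta stand-in are hypothetical (neither is part of ambers); the stand-in exists only so the snippet runs without a .sav file. Treating a None label as "missing" is also an assumption.

```python
def label_or_default(meta, name, default="(no label)"):
    """Return meta.label(name), or `default` if the variable is unknown."""
    try:
        label = meta.label(name)
    except KeyError:
        # Unknown variable: the metadata methods raise KeyError.
        return default
    # Assumption: a variable without a label may come back as None.
    return label if label is not None else default


class FakeMeta:
    """Hypothetical stand-in for the metadata object; NOT part of ambers."""

    def __init__(self, labels):
        self._labels = labels

    def label(self, name):
        return self._labels[name]  # raises KeyError for unknown names


meta = FakeMeta({"Q1": "Overall satisfaction"})
print(label_or_default(meta, "Q1"))   # known variable
print(label_or_default(meta, "Q99"))  # unknown variable falls back
```

With a real file, the same helper could be passed the object returned by am.read_sav_metadata("survey.sav").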

Streaming Reader (Rust)

let mut scanner = ambers::scan_sav("survey.sav")?;
scanner.select(&["age", "gender"])?;
scanner.limit(1000);

while let Some(batch) = scanner.next_batch()? {
    println!("Batch: {} rows", batch.num_rows());
}

Performance

Eager Read

All readers return a Polars DataFrame. Timings are the best of 3–5 runs (with warmup) on a 24-core machine running Windows 11 and Python 3.13.

| File | Size | Rows | Cols | ambers | polars_readstat | pyreadstat | Speedup vs polars_readstat | Speedup vs pyreadstat |
|---|---|---|---|---|---|---|---|---|
| test_1 (bytecode) | 0.2 MB | 1,500 | 75 | < 0.01s | < 0.01s | 0.011s | | |
| test_2 (bytecode) | 147 MB | 22,070 | 677 | 0.286s | 0.897s | 3.524s | 3.1x | 12x |
| test_3 (uncompressed) | 1.1 GB | 79,066 | 915 | 0.322s | 1.150s | 4.918s | 3.6x | 15x |
| test_4 (uncompressed) | 0.6 MB | 201 | 158 | 0.002s | 0.003s | 0.012s | 1.5x | 6x |
| test_5 (uncompressed) | 0.6 MB | 203 | 136 | 0.002s | 0.003s | 0.016s | 1.5x | 8x |
| test_6 (uncompressed) | 5.4 GB | 395,330 | 916 | 1.600s | 1.752s | 25.214s | 1.1x | 16x |

  • Faster than polars_readstat on all tested files — 1.1–3.6x faster
  • 6–16x faster than pyreadstat across all file sizes
  • No PyArrow dependency — uses Arrow PyCapsule Interface for zero-copy transfer
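
The speedup ranges quoted in the bullets can be recomputed from the timings in the table. The sketch below copies those timings by hand (test_1 is left out because its timings are only reported as "< 0.01s"); the names TIMINGS and speedup_range are illustrative and are not part of ambers.

```python
# Timings in seconds, copied from the eager-read table.
TIMINGS = {
    "test_2": {"ambers": 0.286, "polars_readstat": 0.897, "pyreadstat": 3.524},
    "test_3": {"ambers": 0.322, "polars_readstat": 1.150, "pyreadstat": 4.918},
    "test_4": {"ambers": 0.002, "polars_readstat": 0.003, "pyreadstat": 0.012},
    "test_5": {"ambers": 0.002, "polars_readstat": 0.003, "pyreadstat": 0.016},
    "test_6": {"ambers": 1.600, "polars_readstat": 1.752, "pyreadstat": 25.214},
}


def speedup_range(baseline):
    """Return (min, max) of baseline_time / ambers_time across all files."""
    ratios = [t[baseline] / t["ambers"] for t in TIMINGS.values()]
    return min(ratios), max(ratios)


low, high = speedup_range("polars_readstat")
print(f"vs polars_readstat: {low:.1f}x to {high:.1f}x")

low, high = speedup_range("pyreadstat")
print(f"vs pyreadstat: {low:.0f}x to {high:.0f}x")
```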

Lazy Read with Pushdown

scan_sav() returns a Polars LazyFrame. Unlike eager reads, it only reads the data you ask for:

| File (size) | Full collect | Select 5 cols | Head 1000 rows | Select 5 + head 1000 |
|---|---|---|---|---|
| test_2 (147 MB, 22K × 677) | 0.903s | 0.363s (2.5x) | 0.181s (5.0x) | 0.157s (5.7x) |
| test_3 (1.1 GB, 79K × 915) | 0.700s | 0.554s (1.3x) | 0.020s (35x) | 0.012s (58x) |
| test_6 (5.4 GB, 395K × 916) | 3.062s | 2.343s (1.3x) | 0.022s (139x) | 0.013s (236x) |

On the 5.4 GB file, selecting 5 columns and 1000 rows completes in 13ms — 236x faster than reading the full dataset.
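
As a worked check of that figure, each pushdown speedup is the full-collect time divided by the time of the restricted query. The sketch below uses the test_6 timings from the table; the names speedup, FULL_COLLECT and PUSHDOWN are illustrative only.

```python
def speedup(full_seconds, partial_seconds):
    """How many times faster the restricted query is than a full collect."""
    return full_seconds / partial_seconds


# test_6 (5.4 GB) timings in seconds, copied from the table.
FULL_COLLECT = 3.062
PUSHDOWN = {
    "select 5 cols": 2.343,
    "head 1000 rows": 0.022,
    "select 5 + head 1000": 0.013,
}

for label, seconds in PUSHDOWN.items():
    ratio = speedup(FULL_COLLECT, seconds)
    print(f"{label}: {seconds * 1000:.0f} ms, {ratio:.1f}x faster than full collect")
```

The last ratio, 3.062 / 0.013, rounds to the 236x quoted for the 5.4 GB file.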

License

MIT

Download files

Download the file for your platform.

Source Distribution

  • ambers-0.2.6.tar.gz (72.5 kB)

Built Distributions

  • ambers-0.2.6-cp314-cp314-win_amd64.whl (844.6 kB): CPython 3.14, Windows x86-64
  • ambers-0.2.6-cp314-cp314-manylinux_2_34_x86_64.whl (1.0 MB): CPython 3.14, manylinux (glibc 2.34+), x86-64
  • ambers-0.2.6-cp314-cp314-macosx_11_0_arm64.whl (912.1 kB): CPython 3.14, macOS 11.0+, ARM64
  • ambers-0.2.6-cp313-cp313-win_amd64.whl (845.9 kB): CPython 3.13, Windows x86-64
  • ambers-0.2.6-cp313-cp313-manylinux_2_34_x86_64.whl (1.0 MB): CPython 3.13, manylinux (glibc 2.34+), x86-64
  • ambers-0.2.6-cp313-cp313-macosx_11_0_arm64.whl (911.9 kB): CPython 3.13, macOS 11.0+, ARM64
  • ambers-0.2.6-cp312-cp312-win_amd64.whl (846.4 kB): CPython 3.12, Windows x86-64
  • ambers-0.2.6-cp312-cp312-manylinux_2_34_x86_64.whl (1.0 MB): CPython 3.12, manylinux (glibc 2.34+), x86-64
  • ambers-0.2.6-cp312-cp312-macosx_11_0_arm64.whl (912.0 kB): CPython 3.12, macOS 11.0+, ARM64
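
Wheel file names follow the pattern {distribution}-{version}(-{build tag})?-{python tag}-{abi tag}-{platform tag}.whl, so the interpreter and platform a wheel targets can be read off its name. The sketch below splits a file name into those parts using only the standard library; parse_wheel_filename and WheelTags are illustrative names, and the packaging library offers a more thorough parser if it is installed.

```python
from typing import NamedTuple


class WheelTags(NamedTuple):
    name: str
    version: str
    python_tag: str
    abi_tag: str
    platform_tag: str


def parse_wheel_filename(filename):
    """Split a wheel file name into its dash-separated components."""
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel file name: {filename!r}")
    parts = filename[: -len(".whl")].split("-")
    if len(parts) == 6:
        # An optional build tag sits between the version and the python tag.
        del parts[2]
    if len(parts) != 5:
        raise ValueError(f"unexpected wheel file name: {filename!r}")
    return WheelTags(*parts)


tags = parse_wheel_filename("ambers-0.2.6-cp313-cp313-manylinux_2_34_x86_64.whl")
print(tags.python_tag, tags.abi_tag, tags.platform_tag)
```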

File details

Details for the file ambers-0.2.6.tar.gz.

File metadata

  • Download URL: ambers-0.2.6.tar.gz
  • Upload date:
  • Size: 72.5 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for ambers-0.2.6.tar.gz
| Algorithm | Hash digest |
|---|---|
| SHA256 | 6dae3673da0f89d5769adba442c5a1b5a98d5e9d7db364225c15b3e0d7b69bb3 |
| MD5 | 4d309b108b31c63ba6661da23e948bf9 |
| BLAKE2b-256 | 17a5bace6f222e58a3bb7e8b1f94577eba2290fdb9101b9b6848127ff3c239da |
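
A downloaded archive can be checked against the published SHA256 digest with the standard library. The sketch below is a minimal example; the helper names sha256_of and matches_published_digest are illustrative, and the local file path passed to them is an assumption about where the download was saved.

```python
import hashlib

# SHA256 digest listed above for ambers-0.2.6.tar.gz.
EXPECTED_SHA256 = "6dae3673da0f89d5769adba442c5a1b5a98d5e9d7db364225c15b3e0d7b69bb3"


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA256 digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_digest(path, expected=EXPECTED_SHA256):
    """True if the file at `path` has the expected SHA256 digest."""
    return sha256_of(path) == expected.lower()
```

For example, matches_published_digest("ambers-0.2.6.tar.gz") should return True for an intact download saved under that name in the current directory.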


Provenance

The following attestation bundles were made for ambers-0.2.6.tar.gz:

Publisher: release.yml on albertxli/ambers

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.
