Pure Rust SPSS .sav/.zsav reader with Polars DataFrame output

ambers

Pure Rust SPSS .sav/.zsav reader — Arrow-native, zero C dependencies.

Features

  • Read .sav (bytecode) and .zsav (zlib) files
  • Arrow RecordBatch output — zero-copy to Polars, DataFusion, DuckDB
  • Rich metadata: variable labels, value labels, missing values, MR sets, measure levels
  • Lazy reader via scan_sav() — returns Polars LazyFrame with projection and row limit pushdown
  • No PyArrow dependency — uses Arrow PyCapsule Interface for zero-copy transfer
  • Fast: in the benchmarks below, 1.1–3.6x faster than polars_readstat and 6–16x faster than pyreadstat
  • Python + Rust dual API from a single crate

Installation

Python:

pip install ambers

Rust:

cargo add ambers
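
To confirm that the Python install succeeded, a small check along these lines may help. It is a sketch that uses only the standard library; the helper name installed_version is hypothetical and not part of ambers.

```python
from importlib import metadata
from typing import Optional


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_version("ambers")
    print(f"ambers {version}" if version else "ambers is not installed")
```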

Quick Start

Python

import ambers as am

# Eager read — data + metadata
df, meta = am.read_sav("survey.sav")

# Lazy read — returns Polars LazyFrame
lf, meta = am.scan_sav("survey.sav")
df = lf.select(["Q1", "Q2", "age"]).head(1000).collect()

# Explore metadata
meta.summary()
meta.describe("Q1")
meta.value("Q1")

# Read metadata only (fast, skips data)
meta = am.read_sav_metadata("survey.sav")

Rust

use ambers::{read_sav, read_sav_metadata};

// Read data + metadata
let (batch, meta) = read_sav("survey.sav")?;
println!("{} rows, {} cols", batch.num_rows(), meta.number_columns);

// Read metadata only
let meta = read_sav_metadata("survey.sav")?;
println!("{}", meta.label("Q1").unwrap_or("(no label)"));

Metadata API (Python)

| Method | Description |
| --- | --- |
| meta.summary() | Formatted overview: file info, type distribution, annotations |
| meta.describe("Q1") | Deep-dive into a single variable (or list of variables) |
| meta.diff(other) | Compare two metadata objects, returns MetaDiff |
| meta.label("Q1") | Variable label |
| meta.value("Q1") | Value labels dict |
| meta.format("Q1") | SPSS format string (e.g. "F8.2", "A50") |
| meta.measure("Q1") | Measurement level ("nominal", "ordinal", "scale") |
| meta.schema | Full metadata as a nested Python dict |

All variable-name methods raise KeyError for unknown variables.

Streaming Reader (Rust)

let mut scanner = ambers::scan_sav("survey.sav")?;
scanner.select(&["age", "gender"])?;
scanner.limit(1000);

while let Some(batch) = scanner.next_batch()? {
    println!("Batch: {} rows", batch.num_rows());
}

Performance

Eager Read

All results return a Polars DataFrame. Best of 3–5 runs (with warmup) on Windows 11, Python 3.13, 24-core machine.

| File | Size | Rows | Cols | ambers | polars_readstat | pyreadstat | vs polars_readstat | vs pyreadstat |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| test_1 (bytecode) | 0.2 MB | 1,500 | 75 | < 0.01s | < 0.01s | 0.011s | | |
| test_2 (bytecode) | 147 MB | 22,070 | 677 | 0.286s | 0.897s | 3.524s | 3.1x | 12x |
| test_3 (uncompressed) | 1.1 GB | 79,066 | 915 | 0.322s | 1.150s | 4.918s | 3.6x | 15x |
| test_4 (uncompressed) | 0.6 MB | 201 | 158 | 0.002s | 0.003s | 0.012s | 1.5x | 6x |
| test_5 (uncompressed) | 0.6 MB | 203 | 136 | 0.002s | 0.003s | 0.016s | 1.5x | 8x |
| test_6 (uncompressed) | 5.4 GB | 395,330 | 916 | 1.600s | 1.752s | 25.214s | 1.1x | 16x |

  • Faster than polars_readstat on all tested files — 1.1–3.6x faster
  • 6–16x faster than pyreadstat across all file sizes
  • No PyArrow dependency — uses Arrow PyCapsule Interface for zero-copy transfer
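
The "best of N runs with warmup" method and the speedup columns can be reproduced with a harness like the one below. This is a minimal sketch and not the project's actual benchmark script; best_of and speedup are hypothetical helper names, and the speedup is assumed to be the baseline time divided by the ambers time.

```python
import time
from typing import Callable


def best_of(fn: Callable[[], object], runs: int = 5, warmup: int = 1) -> float:
    """Return the fastest wall-clock time in seconds over `runs` timed calls."""
    for _ in range(warmup):
        fn()  # warmup call: executed but not timed
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        timings.append(time.perf_counter() - start)
    return min(timings)


def speedup(baseline_s: float, candidate_s: float) -> float:
    """How many times faster the candidate is than the baseline."""
    return baseline_s / candidate_s


# Timings taken from the test_2 row of the eager-read table.
print(f"{speedup(0.897, 0.286):.1f}x vs polars_readstat")  # 3.1x
print(f"{speedup(3.524, 0.286):.0f}x vs pyreadstat")       # 12x
```

To time a real read, pass a zero-argument callable, for example best_of(lambda: am.read_sav("survey.sav")).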

Lazy Read with Pushdown

scan_sav() returns a Polars LazyFrame. Unlike eager reads, it only reads the data you ask for:

| File (size) | Full collect | Select 5 cols | Head 1000 rows | Select 5 + head 1000 |
| --- | --- | --- | --- | --- |
| test_2 (147 MB, 22K × 677) | 0.903s | 0.363s (2.5x) | 0.181s (5.0x) | 0.157s (5.7x) |
| test_3 (1.1 GB, 79K × 915) | 0.700s | 0.554s (1.3x) | 0.020s (35x) | 0.012s (58x) |
| test_6 (5.4 GB, 395K × 916) | 3.062s | 2.343s (1.3x) | 0.022s (139x) | 0.013s (236x) |

On the 5.4 GB file, selecting 5 columns and 1000 rows completes in 13ms — 236x faster than reading the full dataset.
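
The idea behind projection and row-limit pushdown can be shown with a toy reader. The sketch below is not how ambers is implemented; it reads CSV text with the standard library, and scan_rows is a hypothetical name. In a text format each row that is read is still parsed in full, whereas a binary reader can skip decoding unselected columns, so the toy only demonstrates the shape of the optimization: stop reading once the limit is reached and materialise only the requested columns.

```python
import csv
import io
from itertools import islice
from typing import Dict, Iterator, Optional, Sequence


def scan_rows(
    source: io.TextIOBase,
    columns: Optional[Sequence[str]] = None,
    limit: Optional[int] = None,
) -> Iterator[Dict[str, str]]:
    """Lazily yield rows, keeping only `columns` and stopping after `limit` rows."""
    reader = csv.DictReader(source)
    # Limit pushdown: rows beyond the limit are never pulled from the source.
    for row in islice(reader, limit):
        if columns is None:
            yield dict(row)
        else:
            # Projection pushdown: only the requested columns are kept.
            yield {name: row[name] for name in columns}


data = io.StringIO("Q1,Q2,age\n1,2,34\n3,4,51\n5,1,28\n")
subset = list(scan_rows(data, columns=["Q1", "age"], limit=2))
print(subset)  # [{'Q1': '1', 'age': '34'}, {'Q1': '3', 'age': '51'}]
```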

License

MIT
