Pure Rust SPSS .sav/.zsav reader with Polars DataFrame output

ambers

Pure Rust SPSS .sav/.zsav reader — Arrow-native, zero C dependencies.

Features

  • Read .sav (bytecode) and .zsav (zlib) files
  • Arrow RecordBatch output — zero-copy to Polars, DataFusion, DuckDB
  • Rich metadata: variable labels, value labels, missing values, MR sets, measure levels
  • Lazy reader via scan_sav() — returns Polars LazyFrame with projection and row limit pushdown
  • No PyArrow dependency — uses Arrow PyCapsule Interface for zero-copy transfer
  • Up to 5.5x faster than pyreadstat in the eager benchmarks below; on par with polars_readstat on large files
  • Python + Rust dual API from a single crate

Installation

Python:

pip install ambers
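
To confirm that the install landed in the interpreter you expect, a small check along these lines may help. It uses only the standard library; the helper name `installed_version` is made up for this sketch and is not part of ambers.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is missing."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# Prints a version string if ambers is installed in this environment, else None.
print(installed_version("ambers"))
```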

Rust:

cargo add ambers

Quick Start

Python

import ambers as am

# Eager read — data + metadata
df, meta = am.read_sav("survey.sav")

# Lazy read — returns Polars LazyFrame
lf, meta = am.scan_sav("survey.sav")
df = lf.select(["Q1", "Q2", "age"]).head(1000).collect()

# Explore metadata
meta.summary()
meta.describe("Q1")
meta.value("Q1")

# Read metadata only (fast, skips data)
meta = am.read_sav_metadata("survey.sav")

Rust

use ambers::{read_sav, read_sav_metadata};

// Assumes the ambers error type implements std::error::Error.
fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Read data + metadata
    let (batch, meta) = read_sav("survey.sav")?;
    println!("{} rows, {} cols", batch.num_rows(), meta.number_columns);

    // Read metadata only
    let meta = read_sav_metadata("survey.sav")?;
    println!("{}", meta.label("Q1").unwrap_or("(no label)"));

    Ok(())
}

Metadata API (Python)

| Method | Description |
|---|---|
| meta.summary() | Formatted overview: file info, type distribution, annotations |
| meta.describe("Q1") | Deep-dive into a single variable (or list of variables) |
| meta.diff(other) | Compare two metadata objects, returns MetaDiff |
| meta.label("Q1") | Variable label |
| meta.value("Q1") | Value labels dict |
| meta.format("Q1") | SPSS format string (e.g. "F8.2", "A50") |
| meta.measure("Q1") | Measurement level ("nominal", "ordinal", "scale") |
| meta.schema | Full metadata as a nested Python dict |

All variable-name methods raise KeyError for unknown variables.
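
The value labels dict and the KeyError behaviour can be combined to decode coded responses defensively. The sketch below is only an illustration: `FakeMeta` is a hypothetical stand-in so the snippet runs without a .sav file, `decode` is a made-up helper, and the assumption that the dict is keyed by the raw code is not confirmed by this README.

```python
from typing import Any


class FakeMeta:
    """Hypothetical stand-in for the metadata object; not part of ambers."""

    def __init__(self, value_labels: dict[str, dict[Any, str]]) -> None:
        self._value_labels = value_labels

    def value(self, name: str) -> dict[Any, str]:
        # Mirrors the documented behaviour: unknown variables raise KeyError.
        return self._value_labels[name]


def decode(meta: Any, name: str, codes: list[Any]) -> list[Any]:
    """Replace codes with their value labels; keep codes that have no label."""
    try:
        labels = meta.value(name)
    except KeyError:
        # Unknown variable: return the codes unchanged instead of failing.
        return list(codes)
    return [labels.get(code, code) for code in codes]


meta = FakeMeta({"Q1": {1: "Yes", 2: "No"}})
decoded = decode(meta, "Q1", [1, 2, 9, None])
unknown = decode(meta, "Q99", [1, 2])
print(decoded)  # ['Yes', 'No', 9, None]
print(unknown)  # [1, 2]
```

With a real metadata object, the same `decode` helper would be called with the object returned by `am.read_sav_metadata(...)` in place of `FakeMeta`.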

Streaming Reader (Rust)

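// Assumes the ambers error type implements std::error::Error.
fn main() -> Result<(), Box<dyn std::error::Error>> {
    let mut scanner = ambers::scan_sav("survey.sav")?;
    scanner.select(&["age", "gender"])?;
    scanner.limit(1000);

    while let Some(batch) = scanner.next_batch()? {
        println!("Batch: {} rows", batch.num_rows());
    }

    Ok(())
}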
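
The select / limit / next_batch pattern can be modelled in a few lines of Python. This is a toy model over in-memory rows, meant only to show how a column selection and a row limit interact with batching. `ToyScanner` and its `batch_size` parameter are hypothetical and say nothing about how ambers is implemented.

```python
from typing import Optional


class ToyScanner:
    """Toy model of a batch scanner; hypothetical, not the ambers implementation."""

    def __init__(self, rows: list[dict], batch_size: int = 2) -> None:
        self._rows = rows
        self._batch_size = batch_size
        self._columns: Optional[list[str]] = None
        self._limit: Optional[int] = None
        self._pos = 0

    def select(self, columns: list[str]) -> None:
        # Remember which columns to keep in every batch.
        self._columns = list(columns)

    def limit(self, n: int) -> None:
        # Remember the maximum number of rows to emit in total.
        self._limit = n

    def next_batch(self) -> Optional[list[dict]]:
        total = len(self._rows)
        end = total if self._limit is None else min(self._limit, total)
        if self._pos >= end:
            return None  # exhausted, or the row limit has been reached
        stop = min(self._pos + self._batch_size, end)
        chunk = self._rows[self._pos:stop]
        self._pos = stop
        if self._columns is None:
            return [dict(row) for row in chunk]
        return [{c: row[c] for c in self._columns} for row in chunk]


rows = [{"age": 20 + i, "gender": i % 2, "Q1": i} for i in range(5)]
scanner = ToyScanner(rows, batch_size=2)
scanner.select(["age", "gender"])
scanner.limit(3)

batches = []
while (batch := scanner.next_batch()) is not None:
    batches.append(batch)

print([len(b) for b in batches])  # [2, 1]
```

The last batch is shorter than `batch_size` because the row limit cuts it off, and the unselected column `Q1` never appears in any batch.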

Performance

Eager Read

Every reader's result ends up as a Polars DataFrame. Timings are the average of 5 runs on Windows 11 with Python 3.13 on a 24-core machine.

| File | Size | Rows | Cols | ambers | polars_readstat | pyreadstat | pyreadstat mp (4w) | ambers vs polars_readstat | ambers vs pyreadstat |
|---|---|---|---|---|---|---|---|---|---|
| test_1 (bytecode) | 0.2 MB | 1,500 | 75 | 0.002s | 0.012s | 0.010s | 0.409s | 6.4x faster | 5.5x faster |
| test_2 (bytecode) | 147 MB | 22,070 | 677 | 1.119s | 1.091s | 4.351s | 1.773s | ~tied | 3.9x faster |
| test_3 (uncompressed) | 1.1 GB | 79,066 | 915 | 1.713s | 1.532s | 6.390s | 2.635s | ~tied | 3.7x faster |
| test_4 (uncompressed) | 0.6 MB | 201 | 158 | 0.015s | 0.023s | 0.020s | 0.424s | 1.6x faster | 1.3x faster |
| test_5 (uncompressed) | 0.6 MB | 203 | 136 | 0.003s | 0.013s | 0.014s | 0.417s | 4.3x faster | 4.8x faster |

  • vs pyreadstat: 3.7–5.5x faster on four of the five files, 1.3x faster on test_4
  • vs pyreadstat multiprocess (4 workers): single-threaded ambers is still faster on every file
  • vs polars_readstat: roughly tied on the two large files, 1.6–6.4x faster on the three small files (lower startup overhead)

pyreadstat multiprocess returns pandas; timing includes pl.from_pandas() conversion.
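
For readers who want to repeat a comparison like this on their own files, a timing harness could look roughly like the sketch below. It is not the script behind the numbers above; `mean_runtime` and `speedup` are names invented here. The ratios are recomputed from the pyreadstat and ambers timings in the table for the two large files.

```python
import time
from statistics import mean
from typing import Callable


def mean_runtime(fn: Callable[[], object], runs: int = 5) -> float:
    """Average wall-clock seconds over `runs` calls of `fn`."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        samples.append(time.perf_counter() - start)
    return mean(samples)


def speedup(baseline_s: float, candidate_s: float) -> float:
    """How many times faster the candidate is than the baseline."""
    return baseline_s / candidate_s


# Timings copied from the table above (pyreadstat vs ambers).
test_2 = round(speedup(4.351, 1.119), 1)
test_3 = round(speedup(6.390, 1.713), 1)
print(test_2, test_3)  # 3.9 3.7

# The harness itself, exercised on a stand-in workload.
elapsed = mean_runtime(lambda: sum(range(10_000)))
print(f"{elapsed:.6f}s")
```

In a real run, the lambda would wrap a reader call such as `am.read_sav("survey.sav")`. The small-file ratios in the table cannot be reproduced exactly this way because the listed timings are rounded to the millisecond.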

Lazy Read with Pushdown

scan_sav() returns a Polars LazyFrame. Unlike eager reads, it only reads the data you ask for:

| File (size) | Full collect | Select 5 cols | Head 1000 rows | Select 5 + head 1000 |
|---|---|---|---|---|
| test_2 (147 MB, 22K × 677) | 1.282s | 0.478s (2.7x) | 0.205s (6.3x) | 0.160s (8.0x) |
| test_3 (1.1 GB, 79K × 915) | 1.668s | 0.822s (2.0x) | 0.031s (53.5x) | 0.022s (75.8x) |

On the 1.1 GB file, selecting 5 columns and 1000 rows completes in 22ms — 76x faster than reading the full dataset.
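
As a rough sanity check on that figure, the arithmetic below compares how many cells the result contains with and without pushdown, using the test_3 dimensions from the eager table. It is a back-of-the-envelope sketch. The `cells` helper is invented here, and treating work as proportional to output cells is a simplifying assumption, not a description of how ambers reads files.

```python
def cells(rows: int, cols: int) -> int:
    """Number of cells in a result with the given shape."""
    return rows * cols


# Dimensions of test_3: 79,066 rows x 915 columns.
full = cells(79_066, 915)
pushed_down = cells(1_000, 5)

print(full)                       # 72345390
print(pushed_down)                # 5000
print(round(full / pushed_down))  # 14469

# Measured timings for test_3: full collect vs select 5 + head 1000.
measured = round(1.668 / 0.022, 1)
print(measured)                   # 75.8
```

The result shrinks by a factor of about 14,000 while the measured time shrinks by about 76. One plausible reading is that costs which do not scale with the output, such as opening the file and parsing its metadata, make up much of the remaining 22 ms.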

License

MIT
