Pure Rust SPSS .sav/.zsav reader with Polars DataFrame output

ambers

Pure Rust SPSS .sav/.zsav reader — Arrow-native, zero C dependencies.

Features

  • Read .sav (uncompressed or bytecode-compressed) and .zsav (zlib-compressed) files
  • Arrow RecordBatch output — zero-copy to Polars, DataFusion, DuckDB
  • Rich metadata: variable labels, value labels, missing values, MR sets, measure levels
  • Lazy reader via scan_sav() — returns Polars LazyFrame with projection and row limit pushdown
  • No PyArrow dependency — uses Arrow PyCapsule Interface for zero-copy transfer
  • One of the fastest SPSS readers: up to 2.5x faster than polars_readstat and roughly 4–10x faster than pyreadstat (see Performance)
  • Python + Rust dual API from a single crate
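The storage variants listed above can be told apart from the file header before any reader is involved. The sketch below is not part of ambers and does not describe its internals. It relies on the publicly documented SPSS system-file layout (as described in the PSPP developer documentation): a 4-byte magic (`$FL2` or `$FL3`), a 32-bit layout code at byte offset 64, and a 32-bit compression code at byte offset 72 (0 = none, 1 = bytecode, 2 = zlib). The helper name `sniff_sav_compression` is hypothetical, and an ASCII-encoded header is assumed.

```python
import struct

# Compression codes from the publicly documented SPSS system-file header.
_COMPRESSION = {0: "uncompressed", 1: "bytecode", 2: "zlib"}


def sniff_sav_compression(path):
    """Return 'uncompressed', 'bytecode', or 'zlib' for an SPSS system file.

    Hypothetical helper, not part of ambers. Assumes an ASCII-encoded header.
    """
    with open(path, "rb") as fh:
        header = fh.read(76)
    if len(header) < 76 or header[:4] not in (b"$FL2", b"$FL3"):
        raise ValueError(f"{path!r} does not look like an SPSS system file")

    # layout_code (offset 64) is 2 or 3; use it to detect the byte order.
    little = struct.unpack_from("<i", header, 64)[0] in (2, 3)
    byte_order = "<" if little else ">"
    (code,) = struct.unpack_from(byte_order + "i", header, 72)
    try:
        return _COMPRESSION[code]
    except KeyError:
        raise ValueError(f"unknown compression code {code}") from None
```

Calling `sniff_sav_compression("survey.sav")` on a real file would return one of the three strings, which can be handy when comparing timings across differently stored files.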

Installation

Python:

pip install ambers

Rust:

cargo add ambers

Quick Start

Python

import ambers as am

# Eager read — data + metadata
df, meta = am.read_sav("survey.sav")

# Lazy read — returns Polars LazyFrame
lf, meta = am.scan_sav("survey.sav")
df = lf.select(["Q1", "Q2", "age"]).head(1000).collect()

# Explore metadata
meta.summary()
meta.describe("Q1")
meta.value("Q1")

# Read metadata only (fast, skips data)
meta = am.read_sav_metadata("survey.sav")
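A common next step after `meta.value("Q1")` is turning coded responses into their labels. The sketch below uses plain Python and a hand-written dict standing in for the value-labels dict. The key types (floats here), the sample contents, and the helper name `apply_value_labels` are assumptions, so check what `meta.value()` actually returns for your file.

```python
def apply_value_labels(codes, value_labels):
    """Replace each code with its label; leave unlabeled codes and None unchanged."""
    return [
        value_labels.get(code, code) if code is not None else None
        for code in codes
    ]


# Stand-in for the dict returned by meta.value("Q1"); contents are made up.
q1_labels = {1.0: "Yes", 2.0: "No", 9.0: "Refused"}
q1_codes = [1.0, 2.0, None, 9.0, 3.0]

labeled = apply_value_labels(q1_codes, q1_labels)
print(labeled)  # ['Yes', 'No', None, 'Refused', 3.0]
```

Codes without a label (3.0 above) pass through unchanged, which keeps unexpected values visible instead of silently dropping them.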

Rust

use ambers::{read_sav, read_sav_metadata};

// Read data + metadata
let (batch, meta) = read_sav("survey.sav")?;
println!("{} rows, {} cols", batch.num_rows(), meta.number_columns);

// Read metadata only
let meta = read_sav_metadata("survey.sav")?;
println!("{}", meta.label("Q1").unwrap_or("(no label)"));

Metadata API (Python)

| Method | Description |
|---|---|
| meta.summary() | Formatted overview: file info, type distribution, annotations |
| meta.describe("Q1") | Deep-dive into a single variable (or list of variables) |
| meta.diff(other) | Compare two metadata objects, returns MetaDiff |
| meta.label("Q1") | Variable label |
| meta.value("Q1") | Value labels dict |
| meta.format("Q1") | SPSS format string (e.g. "F8.2", "A50") |
| meta.measure("Q1") | Measurement level ("nominal", "ordinal", "scale") |
| meta.schema | Full metadata as a nested Python dict |

All variable-name methods raise KeyError for unknown variables.
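Because unknown names raise KeyError, code that probes for optional variables can catch that exception instead of pre-checking. The following is a minimal sketch: `label_or_default` and the `FakeMeta` stand-in are hypothetical and exist only so the example runs without a .sav file. It also assumes an unlabeled variable yields an empty or None label. With ambers, you would pass the metadata object returned by `read_sav_metadata()` instead.

```python
def label_or_default(meta, name, default="(no label)"):
    """Return the variable label, or `default` if the variable is unknown or unlabeled."""
    try:
        label = meta.label(name)
    except KeyError:
        return default
    return label if label else default


class FakeMeta:
    """Stand-in that mimics the documented KeyError behavior (not the ambers class)."""

    def __init__(self, labels):
        self._labels = labels

    def label(self, name):
        return self._labels[name]  # raises KeyError for unknown variables


meta = FakeMeta({"Q1": "Overall satisfaction", "age": None})
print(label_or_default(meta, "Q1"))       # Overall satisfaction
print(label_or_default(meta, "age"))      # (no label)
print(label_or_default(meta, "missing"))  # (no label)
```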

Streaming Reader (Rust)

let mut scanner = ambers::scan_sav("survey.sav")?;
scanner.select(&["age", "gender"])?;
scanner.limit(1000);

while let Some(batch) = scanner.next_batch()? {
    println!("Batch: {} rows", batch.num_rows());
}

Performance

Eager Read

Every reader's result is returned as a Polars DataFrame. Timings are the average of 5 runs on a 24-core machine running Windows 11 and Python 3.13.

| File | Size | Rows | Cols | ambers | polars_readstat | ambers vs polars_readstat | pyreadstat | pyreadstat mp (4 workers) | ambers vs pyreadstat |
|---|---|---|---|---|---|---|---|---|---|
| test_1 (bytecode) | 0.2 MB | 1,500 | 75 | 0.002s | 0.004s | 2.0x faster | 0.010s | 0.493s | 5.0x faster |
| test_2 (bytecode) | 147 MB | 22,070 | 677 | 0.812s | 0.991s | 1.2x faster | 3.564s | 1.781s | 4.4x faster |
| test_3 (uncompressed) | 1.1 GB | 79,066 | 915 | 0.509s | 1.279s | 2.5x faster | 4.849s | 2.764s | 9.5x faster |
| test_4 (uncompressed) | 0.6 MB | 201 | 158 | 0.002s | 0.004s | 2.0x faster | 0.018s | 0.470s | 9.0x faster |
| test_5 (uncompressed) | 0.6 MB | 203 | 136 | 0.002s | 0.004s | 2.0x faster | 0.015s | 0.454s | 7.5x faster |
| test_6 (uncompressed) | 5.4 GB | 395,330 | 916 | 2.801s | 1.809s | 1.5x slower | 24.199s | 11.718s | 8.6x faster |

  • vs polars_readstat: faster on 5 of 6 files, by 1.2–2.5x (on test_6, at 5.4 GB, ambers is 1.5x slower)
  • vs pyreadstat: 4–10x faster across all file sizes
  • vs pyreadstat multiprocess (4 workers): ambers single-threaded is still faster on every file

pyreadstat multiprocess returns pandas; timing includes pl.from_pandas() conversion.
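The speedup columns are simply the other reader's time divided by the ambers time. As a check on the arithmetic, the snippet below recomputes them from the single-process timings in the table. The values are copied from the table; the variable and function names are illustrative only.

```python
# (ambers, polars_readstat, pyreadstat) timings in seconds, copied from the table.
timings = {
    "test_1": (0.002, 0.004, 0.010),
    "test_2": (0.812, 0.991, 3.564),
    "test_3": (0.509, 1.279, 4.849),
    "test_4": (0.002, 0.004, 0.018),
    "test_5": (0.002, 0.004, 0.015),
    "test_6": (2.801, 1.809, 24.199),
}


def speedup(baseline, candidate):
    """How many times faster `candidate` is than `baseline` (< 1 means slower)."""
    return baseline / candidate


for name, (ambers, prs, pyreadstat) in timings.items():
    vs_prs = speedup(prs, ambers)
    vs_pyreadstat = speedup(pyreadstat, ambers)
    print(f"{name}: {vs_prs:.1f}x vs polars_readstat, {vs_pyreadstat:.1f}x vs pyreadstat")
```

A ratio below 1 means ambers was the slower reader. For test_6 the ratio against polars_readstat is about 0.65, which is the "1.5x slower" entry (2.801 / 1.809 ≈ 1.5).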

Lazy Read with Pushdown

scan_sav() returns a Polars LazyFrame. Unlike an eager read, it reads only the columns and rows the query asks for. Speedups in parentheses are relative to the full collect:

| File (size) | Full collect | Select 5 cols | Head 1000 rows | Select 5 + head 1000 |
|---|---|---|---|---|
| test_2 (147 MB, 22K × 677) | 0.903s | 0.363s (2.5x) | 0.181s (5.0x) | 0.157s (5.7x) |
| test_3 (1.1 GB, 79K × 915) | 0.700s | 0.554s (1.3x) | 0.020s (35x) | 0.012s (58x) |
| test_6 (5.4 GB, 395K × 916) | 3.062s | 2.343s (1.3x) | 0.022s (139x) | 0.013s (236x) |

On the 5.4 GB file, selecting 5 columns and 1000 rows completes in 13ms — 236x faster than reading the full dataset.
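To see why pushdown helps, consider a reader that knows the requested columns and row limit before it starts decoding. The toy sketch below is plain Python over an in-memory list of rows; it is not how ambers is implemented, and `scan_rows` with its parameters is hypothetical. It decodes only the selected columns and stops once the limit is reached, so the work scales with the request rather than with the file.

```python
def scan_rows(rows, columns, select=None, limit=None):
    """Yield dict rows, reading only `select` columns and stopping after `limit` rows.

    Toy model of projection and row-limit pushdown; hypothetical, not the ambers API.
    """
    wanted = list(columns) if select is None else list(select)
    indices = [columns.index(name) for name in wanted]  # ValueError if unknown
    for row_number, raw in enumerate(rows):
        if limit is not None and row_number >= limit:
            break  # limit pushdown: stop before decoding any further rows
        # projection pushdown: only the requested cells are read
        yield {name: raw[i] for name, i in zip(wanted, indices)}


columns = ["Q1", "Q2", "age", "gender"]
rows = [(1, 5, 34, "f"), (2, 4, 51, "m"), (1, 3, 27, "f")]

subset = list(scan_rows(rows, columns, select=["age", "Q1"], limit=2))
print(subset)  # [{'age': 34, 'Q1': 1}, {'age': 51, 'Q1': 2}]
```

In the Polars API the same request is expressed as `lf.select([...]).head(n).collect()`, as in the Quick Start.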

License

MIT

