Pure Rust SPSS .sav/.zsav reader with Polars DataFrame output

ambers

Pure Rust SPSS .sav/.zsav reader — Arrow-native, zero C dependencies.

Features

  • Read .sav (bytecode) and .zsav (zlib) files
  • Arrow RecordBatch output — zero-copy to Polars, DataFusion, DuckDB
  • Rich metadata: variable labels, value labels, missing values, MR sets, measure levels
  • Lazy reader via scan_sav() — returns Polars LazyFrame with projection and row limit pushdown
  • No PyArrow dependency — uses Arrow PyCapsule Interface for zero-copy transfer
  • One of the fastest SPSS readers: up to 2.5x faster than polars_readstat and 4–10x faster than pyreadstat in the benchmarks below
  • Python + Rust dual API from a single crate
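
To make the .sav (bytecode) versus .zsav (zlib) distinction concrete, here is a minimal sketch that classifies a file from its header bytes. It relies on the publicly documented SPSS system file layout (magic `$FL2` or `$FL3`, a layout code at byte 64, a compression code at byte 72), not on ambers' source. The function name `sniff_sav_header` is hypothetical and is not part of the ambers API.

```python
import struct

# Compression codes from the publicly documented system file header
# (assumption: not taken from ambers' own source code).
COMPRESSION = {0: "uncompressed", 1: "bytecode", 2: "zlib"}


def sniff_sav_header(header: bytes) -> dict:
    """Classify a .sav/.zsav file from its first 76 bytes (hypothetical helper)."""
    if len(header) < 76:
        raise ValueError("need at least 76 header bytes")
    magic = header[:4]
    if magic not in (b"$FL2", b"$FL3"):
        raise ValueError(f"not an SPSS system file: {magic!r}")
    # The layout code at offset 64 reads as 2 or 3 only with the right byte order.
    for byte_order in ("<", ">"):
        (layout_code,) = struct.unpack_from(byte_order + "i", header, 64)
        if layout_code in (2, 3):
            break
    else:
        raise ValueError("could not determine byte order")
    (code,) = struct.unpack_from(byte_order + "i", header, 72)
    return {
        "magic": magic.decode("ascii"),
        "byte_order": "little" if byte_order == "<" else "big",
        "compression": COMPRESSION.get(code, f"unknown ({code})"),
    }


# Synthetic headers for illustration only; a real file would be read with
# open(path, "rb").read(76).
zsav_like = b"$FL3" + b" " * 60 + struct.pack("<iii", 2, 10, 2)
sav_like = b"$FL2" + b" " * 60 + struct.pack(">iii", 2, 10, 1)
print(sniff_sav_header(zsav_like))
print(sniff_sav_header(sav_like))
```

The sketch only inspects the header; decoding the data records is the part ambers handles.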

Installation

Python:

pip install ambers

Rust:

cargo add ambers

Quick Start

Python

import ambers as am

# Eager read — data + metadata
df, meta = am.read_sav("survey.sav")

# Lazy read — returns Polars LazyFrame
lf, meta = am.scan_sav("survey.sav")
df = lf.select(["Q1", "Q2", "age"]).head(1000).collect()

# Explore metadata
meta.summary()
meta.describe("Q1")
meta.value("Q1")

# Read metadata only (fast, skips data)
meta = am.read_sav_metadata("survey.sav")

Rust

use ambers::{read_sav, read_sav_metadata};

// Read data + metadata
let (batch, meta) = read_sav("survey.sav")?;
println!("{} rows, {} cols", batch.num_rows(), meta.number_columns);

// Read metadata only
let meta = read_sav_metadata("survey.sav")?;
println!("{}", meta.label("Q1").unwrap_or("(no label)"));

Metadata API (Python)

| Method | Description |
|---|---|
| meta.summary() | Formatted overview: file info, type distribution, annotations |
| meta.describe("Q1") | Deep-dive into a single variable (or list of variables) |
| meta.diff(other) | Compare two metadata objects, returns MetaDiff |
| meta.label("Q1") | Variable label |
| meta.value("Q1") | Value labels dict |
| meta.format("Q1") | SPSS format string (e.g. "F8.2", "A50") |
| meta.measure("Q1") | Measurement level ("nominal", "ordinal", "scale") |
| meta.schema | Full metadata as a nested Python dict |

All variable-name methods raise KeyError for unknown variables.
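
Because unknown names raise KeyError, code that probes for optional variables may want a fallback, much like `unwrap_or("(no label)")` in the Rust quick start. The sketch below is one way to do that. `FakeMeta` is a hypothetical stand-in that only mimics the documented KeyError behaviour so the example runs without a .sav file, and `label_or_default` is an illustrative helper, not part of ambers.

```python
class FakeMeta:
    """Hypothetical stand-in for the metadata object (not the ambers class)."""

    def __init__(self, labels):
        self._labels = dict(labels)

    def label(self, name):
        # Mirrors the documented behaviour: unknown variables raise KeyError.
        return self._labels[name]


def label_or_default(meta, name, default="(no label)"):
    """Return the variable label, or `default` if the variable is unknown."""
    try:
        return meta.label(name)
    except KeyError:
        return default


meta = FakeMeta({"Q1": "Overall satisfaction"})
print(label_or_default(meta, "Q1"))
print(label_or_default(meta, "Q99"))
```

The same try/except pattern should apply to the other variable-name methods in the table, since they are documented to raise the same exception.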

Streaming Reader (Rust)

let mut scanner = ambers::scan_sav("survey.sav")?;
scanner.select(&["age", "gender"])?;
scanner.limit(1000);

while let Some(batch) = scanner.next_batch()? {
    println!("Batch: {} rows", batch.num_rows());
}

Performance

Eager Read

Every reader's result is returned as a Polars DataFrame. Timings are the average of 5 runs on a 24-core machine running Windows 11 and Python 3.13.

| File | Size | Rows | Cols | ambers | polars_readstat | ambers vs polars_readstat | pyreadstat | pyreadstat multiprocess (4 workers) | ambers vs pyreadstat |
|---|---|---|---|---|---|---|---|---|---|
| test_1 (bytecode) | 0.2 MB | 1,500 | 75 | 0.002s | 0.004s | 2.0x faster | 0.010s | 0.493s | 5.0x faster |
| test_2 (bytecode) | 147 MB | 22,070 | 677 | 0.812s | 0.991s | 1.2x faster | 3.564s | 1.781s | 4.4x faster |
| test_3 (uncompressed) | 1.1 GB | 79,066 | 915 | 0.509s | 1.279s | 2.5x faster | 4.849s | 2.764s | 9.5x faster |
| test_4 (uncompressed) | 0.6 MB | 201 | 158 | 0.002s | 0.004s | 2.0x faster | 0.018s | 0.470s | 9.0x faster |
| test_5 (uncompressed) | 0.6 MB | 203 | 136 | 0.002s | 0.004s | 2.0x faster | 0.015s | 0.454s | 7.5x faster |
| test_6 (uncompressed) | 5.4 GB | 395,330 | 916 | 2.801s | 1.809s | 1.5x slower | 24.199s | 11.718s | 8.6x faster |

  • vs polars_readstat: faster on 5 of 6 files — 1.2–2.5x faster (test_6 at 5.4 GB is 1.5x slower)
  • vs pyreadstat: 4–10x faster across all file sizes
  • vs pyreadstat multiprocess (4 workers): ambers single-threaded still faster on every file

pyreadstat multiprocess returns pandas; timing includes pl.from_pandas() conversion.
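
The "x faster" and "x slower" figures in the table are ratios of the two timings. As a quick check, this sketch recomputes a few of them from the published numbers; the helper name `compare` is made up for illustration.

```python
def compare(ambers_s: float, other_s: float) -> str:
    """Describe ambers' time relative to another reader's time."""
    if ambers_s <= other_s:
        return f"{other_s / ambers_s:.1f}x faster"
    return f"{ambers_s / other_s:.1f}x slower"


# Timings copied from the eager-read table (seconds).
print("test_3 vs polars_readstat:", compare(0.509, 1.279))
print("test_6 vs polars_readstat:", compare(2.801, 1.809))
print("test_2 vs pyreadstat:", compare(0.812, 3.564))
print("test_6 vs pyreadstat:", compare(2.801, 24.199))
```

Ratios for the smallest files are computed from timings rounded to a millisecond, so they are coarser than the ratios for the large files.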

Lazy Read with Pushdown

scan_sav() returns a Polars LazyFrame. Unlike eager reads, it only reads the data you ask for:

| File (size) | Full collect | Select 5 cols | Head 1000 rows | Select 5 + head 1000 |
|---|---|---|---|---|
| test_2 (147 MB, 22K × 677) | 0.903s | 0.363s (2.5x) | 0.181s (5.0x) | 0.157s (5.7x) |
| test_3 (1.1 GB, 79K × 915) | 0.700s | 0.554s (1.3x) | 0.020s (35x) | 0.012s (58x) |
| test_6 (5.4 GB, 395K × 916) | 3.062s | 2.343s (1.3x) | 0.022s (139x) | 0.013s (236x) |

On the 5.4 GB file, selecting 5 columns and 1000 rows completes in 13ms — 236x faster than reading the full dataset.
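
The reason a row limit helps so much is that the reader can stop as soon as it has enough rows, and a column selection lets it skip values nobody asked for. The toy sketch below illustrates that idea with plain Python generators. It is not how ambers is implemented; `scan_rows` and `fake_file` are hypothetical names used only here.

```python
from itertools import islice


def fake_file(n_rows):
    """Stand-in for a large file: yields one decoded row at a time."""
    for i in range(n_rows):
        yield {"Q1": i % 5, "Q2": i % 3, "age": 18 + i % 60}


def scan_rows(rows, columns=None, n_rows=None, stats=None):
    """Toy scan with projection (columns) and row-limit (n_rows) pushdown."""
    # Row-limit pushdown: stop pulling from the source after n_rows.
    source = rows if n_rows is None else islice(rows, n_rows)
    for row in source:
        if stats is not None:
            stats["rows_decoded"] += 1
        # Projection pushdown: keep only the requested columns.
        yield row if columns is None else {c: row[c] for c in columns}


stats = {"rows_decoded": 0}
result = list(
    scan_rows(fake_file(1_000_000), columns=["Q1", "age"], n_rows=1000, stats=stats)
)
print(len(result), "rows returned,", stats["rows_decoded"], "rows decoded")
```

In this toy version only 1000 of the million rows are ever produced, which mirrors why the "Head 1000 rows" column barely grows with file size.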

License

MIT
