Thin Polars IO plugin for the sas7bdat crate

sas7bdat-polars

A Polars IO plugin for reading SAS7BDAT files, backed by the SIMD-accelerated sas7bdat Rust parser. It registers a native IO source via polars.io.plugins.register_io_source, so scans are lazy and support projection and predicate pushdown straight into the reader.

Installation

pip install sas7bdat-polars

Version constraints

This wheel is tightly coupled to its build environment:

  • Polars is pinned to 1.41.*. The extension shares the Polars Rust ABI (via polars-ffi) with the in-process polars package, so the installed polars must match the version the wheel was built against. A mismatch is undefined behavior, not a graceful error.
  • Built against the CPython stable ABI (abi3, minimum 3.12), so a single cp312-abi3 wheel runs on CPython 3.12 and newer.
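Because a mismatched Polars is described above as undefined behavior rather than a clean error, it can help to check the installed version before the extension module is imported, for example in environments where dependency pins get bypassed. The following is a minimal sketch using only the standard library; the helper names `polars_matches_pin` and `check_installed_polars` are hypothetical and not part of sas7bdat-polars.

```python
from importlib import metadata

# Taken from the pin stated above (1.41.*); update it when the wheel's pin changes.
REQUIRED_POLARS_SERIES = ("1", "41")


def polars_matches_pin(version: str) -> bool:
    """Return True when `version` belongs to the pinned major.minor series."""
    return tuple(version.split(".")[:2]) == REQUIRED_POLARS_SERIES


def check_installed_polars() -> None:
    """Raise a readable error if polars is missing or outside the pinned series."""
    try:
        installed = metadata.version("polars")
    except metadata.PackageNotFoundError as exc:
        raise RuntimeError("polars is not installed") from exc
    if not polars_matches_pin(installed):
        series = ".".join(REQUIRED_POLARS_SERIES)
        raise RuntimeError(
            f"polars {installed} is installed, but this wheel expects {series}.*"
        )
```

Calling `check_installed_polars()` before `import sas7bdat_polars` turns a version mismatch into an ordinary exception instead of leaving it to the ABI boundary.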

Usage

import polars as pl
import sas7bdat_polars as sp

# Lazy scan — returns a LazyFrame; filters/projections push down into the reader.
lf = sp.scan_sas("data.sas7bdat")
df = lf.filter(pl.col("age") > 30).select("name", "age").collect()

# Hydrate value labels from a companion catalog.
lf = sp.scan_sas("data.sas7bdat", catalog_path="formats.sas7bcat")

# Inspect the Arrow schema without reading rows.
schema = sp.schema_for_file("data.sas7bdat")

# Return character columns as Categorical (low-cardinality category codes).
lf = sp.scan_sas("survey.sas7bdat", categorical=True)

# SAS stores every numeric column as a float. Declare integer-coded columns
# (registry/category codes) explicitly to get Int64 out instead of Float64:
lf = sp.scan_sas(
    "bef2020.sas7bdat",
    schema_overrides={"KOEN": pl.Int64, "SOCIO13": pl.Int64, "HFAUDD": pl.Int64},
)
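An Int64 override only fits a column whose stored floats are all whole numbers inside the signed 64-bit range. The sketch below illustrates that condition in plain Python so a column can be reasoned about before it is added to an override map. It is an illustration of the rule, not the plugin's implementation, and `fits_int64` is a hypothetical helper. How the reader treats SAS missing values is not covered here, so non-finite inputs simply return False.

```python
import math

INT64_MIN = -(2**63)
INT64_MAX = 2**63 - 1


def fits_int64(value: float) -> bool:
    """Return True if `value` is a finite whole number representable as Int64."""
    if math.isnan(value) or math.isinf(value):
        return False  # non-finite values are outside the scope of this sketch
    if not value.is_integer():
        return False  # e.g. 2.5 has a fractional part
    # int() is exact for integral floats, so the range check is exact too.
    return INT64_MIN <= int(value) <= INT64_MAX
```

For example, `fits_int64(2.0)` is True, while `fits_int64(2.5)` and `fits_int64(1e19)` are both False.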

categorical=True casts every character column to Categorical in the lazy plan using Polars' own cast; it is equivalent to sp.scan_sas(path).with_columns(pl.col(pl.String).cast(pl.Categorical)). The benefit is downstream: group-by, join, and sort on these columns run on u32 codes and are ~10–15× faster. It is not a read-time or memory win. Polars' String is already compact, so the cast adds a little to the read (~0.6s on a 2.5k-string-column file) and uses more memory. Enable it only when you will group or join on the string columns. (Contrast with the R binding's categorical=TRUE, where factor is a read-speed and memory win.)

schema_overrides is applied at schema time, so the lazy schema and the collected frame always agree, and the same override map yields the same dtypes for every file of a register. Override names that don't exist in a given file are ignored, so a register-wide map can be passed wholesale.

If a file contains a value that violates an Int64 override (non-integral or out of range), the scan fails with an error naming the column, row, and value. It never silently falls back to Float64.

Supported override dtypes: Int64, Float64, Date, Datetime, Time, String, Binary. Numeric columns can only be re-typed to numeric or temporal dtypes, and character columns only to String or Binary. Feature-detect with sp.PLUGIN_CONTRACT_VERSION >= "sas7bdat_polars.v2".
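A plain string comparison on the contract version is lexicographic, so a hypothetical future tag such as "sas7bdat_polars.v10" would compare as smaller than "sas7bdat_polars.v2". The sketch below compares the numeric suffix instead. It assumes the tag always has the form `sas7bdat_polars.v<integer>`, which is inferred from the single value shown above and is not confirmed by the package; `contract_major` and `supports_schema_overrides` are hypothetical helpers.

```python
import re

# Assumed tag shape: "sas7bdat_polars.v" followed by a non-negative integer.
_CONTRACT_RE = re.compile(r"^sas7bdat_polars\.v(\d+)$")


def contract_major(tag: str) -> int:
    """Extract the integer after '.v'; raise ValueError on an unexpected tag."""
    match = _CONTRACT_RE.match(tag)
    if match is None:
        raise ValueError(f"unrecognized plugin contract tag: {tag!r}")
    return int(match.group(1))


def supports_schema_overrides(tag: str) -> bool:
    """True when the contract is v2 or later, compared numerically."""
    return contract_major(tag) >= 2
```

In application code this would be called as `supports_schema_overrides(sp.PLUGIN_CONTRACT_VERSION)`.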

License

MIT — see the repository for details.
