Rearrange data from a normalized CSV format to a crosstabulated format, with styling.

crosstab

crosstab rearranges data from a normalized CSV format to a crosstabulated XLSX workbook, with styling. The pivot is computed in a single pass by DuckDB and the workbook is produced by XlsxWriter, so even very large inputs crosstab in seconds. Column names containing spaces, parentheses, embedded quotes, unicode, leading digits, or SQL reserved words pass through unmodified.

Go from this:

[Image: Crosstab Input]

To this:

[Image: Crosstab Output]

Installation

You can install crosstab via pip from PyPI:

pip install crosstab

There is also a Docker image available on the GitHub Container Registry:

docker pull ghcr.io/geocoug/crosstab:latest

Usage

The output workbook contains:

  1. README — metadata about the run (timestamp, user, script version, input/output paths).
  2. Crosstab — the pivoted table. Row-header values are listed on the left; each distinct combination of column-header values fans out across the top, with one sub-column per requested value column.
  3. Source Data (optional) — a verbatim copy of the input CSV, written when keep_src=True.
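The Crosstab sheet layout described above can be sketched in plain Python. This is an illustration of the layout only, not the library's implementation (which uses DuckDB and XlsxWriter). The `pivot` helper, the sample rows, and the flattened `" | "` header labels are all assumptions made for this example; the real workbook may arrange its header rows differently.

```python
# Hypothetical normalized rows; the values are invented for illustration.
FIELDS = ("location", "sample", "cas_rn", "parameter", "concentration", "units")
ROWS = [
    dict(zip(FIELDS, values))
    for values in [
        ("A", "S1", "7439-92-1", "Lead", "0.5", "mg/L"),
        ("A", "S1", "7440-38-2", "Arsenic", "0.1", "mg/L"),
        ("B", "S2", "7439-92-1", "Lead", "0.7", "mg/L"),
    ]
]


def pivot(rows, row_headers, col_headers, value_cols):
    """Return (header, body) for a crosstab of `rows` (illustrative helper)."""

    def key(row, names):
        return tuple(row[name] for name in names)

    # Distinct row keys go down the left, distinct column keys across the top.
    row_keys = sorted({key(r, row_headers) for r in rows})
    col_keys = sorted({key(r, col_headers) for r in rows})
    cells = {
        (key(r, row_headers), key(r, col_headers)): key(r, value_cols)
        for r in rows
    }

    # One sub-column per value column under each column-key combination.
    header = list(row_headers)
    for ck in col_keys:
        for value_col in value_cols:
            header.append(" | ".join(ck + (value_col,)))

    blank = ("",) * len(value_cols)
    body = []
    for rk in row_keys:
        line = list(rk)
        for ck in col_keys:
            line.extend(cells.get((rk, ck), blank))
        body.append(line)
    return header, body


header, body = pivot(
    ROWS,
    row_headers=("location", "sample"),
    col_headers=("cas_rn", "parameter"),
    value_cols=("concentration", "units"),
)
```

Here `header` starts with the two row-header names, followed by two sub-columns (concentration and units) for each of the two `(cas_rn, parameter)` combinations. Combinations missing for a given row are left blank in this sketch.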

Each of the examples below produces the same output.

Python

from pathlib import Path

from crosstab import Crosstab

Crosstab(
    incsv=Path("data.csv"),
    outxlsx=Path("crosstabbed_data.xlsx"),
    row_headers=("location", "sample"),
    col_headers=("cas_rn", "parameter"),
    value_cols=("concentration", "units"),
    keep_src=True,
).crosstab()
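To try the call above without real data, a small normalized input file can be generated with the standard library. The column names match the example; the rows themselves and the `write_sample_csv` helper are invented for illustration and are not part of crosstab.

```python
import csv
from pathlib import Path

# Column names taken from the example above.
FIELDS = ["location", "sample", "cas_rn", "parameter", "concentration", "units"]

# Hypothetical measurements, one row per (location, sample, analyte).
SAMPLE_ROWS = [
    ["A", "S1", "7439-92-1", "Lead", "0.5", "mg/L"],
    ["A", "S1", "7440-38-2", "Arsenic", "0.1", "mg/L"],
    ["B", "S2", "7439-92-1", "Lead", "0.7", "mg/L"],
]


def write_sample_csv(path: Path) -> Path:
    """Write the sample rows to `path` as a normalized CSV and return the path."""
    with path.open("w", newline="", encoding="utf-8") as fh:
        writer = csv.writer(fh)
        writer.writerow(FIELDS)
        writer.writerows(SAMPLE_ROWS)
    return path
```

Calling `write_sample_csv(Path("data.csv"))` produces a file that can be passed as `incsv`.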

Command Line

-r, -c, and -v each accept one or more column names following the flag:

crosstab -s \
    -f data.csv \
    -o crosstabbed_data.xlsx \
    -r location sample \
    -c cas_rn parameter \
    -v concentration units

Run crosstab --help for the full option list.
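When the command line tool is driven from a script, the same invocation can be assembled as an argument list. The sketch below uses only the flags shown above. `build_crosstab_command` and `run_crosstab` are hypothetical helpers, and treating `-s` as the counterpart of `keep_src=True` is an assumption based on the examples producing the same output; check `crosstab --help` before relying on it.

```python
import subprocess
from typing import List, Sequence


def build_crosstab_command(
    incsv: str,
    outxlsx: str,
    row_headers: Sequence[str],
    col_headers: Sequence[str],
    value_cols: Sequence[str],
    keep_source: bool = False,
) -> List[str]:
    """Assemble the argv list for the crosstab CLI (illustrative helper)."""
    cmd = ["crosstab"]
    if keep_source:
        # Assumption: -s mirrors keep_src=True in the Python API.
        cmd.append("-s")
    cmd += ["-f", incsv, "-o", outxlsx]
    # -r, -c, and -v each take one or more column names after the flag.
    cmd += ["-r", *row_headers]
    cmd += ["-c", *col_headers]
    cmd += ["-v", *value_cols]
    return cmd


def run_crosstab(cmd: Sequence[str]) -> None:
    """Run the command; raises CalledProcessError on a non-zero exit."""
    subprocess.run(list(cmd), check=True)
```

Passing the arguments as a list avoids shell quoting problems for column names that contain spaces or parentheses.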

Docker

docker run --rm -v "$(pwd)":/data ghcr.io/geocoug/crosstab:latest \
    -s -f /data/data.csv -o /data/crosstabbed_data.xlsx \
    -r location sample \
    -c cas_rn parameter \
    -v concentration units

Behavior

  • Strings preserved. All CSV cells are read as strings via DuckDB's read_csv(..., all_varchar=True), so values like 01 and 2026-05-04 are not coerced to numbers or dates.
  • Deterministic ordering. Row keys and column keys are sorted before being written, so re-running the same input produces a byte-identical output (modulo the timestamp on the README sheet).
  • Strict duplicate detection. If any (row_key, col_key) combination appears more than once in the input, the run fails with a clear ValueError rather than silently dropping data.
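The string-preservation and duplicate-detection rules above can be illustrated with the standard library. This is a sketch of the checks, not crosstab's own code: the `csv` module stands in for DuckDB's all-varchar read, and the helper names, column names, and sample text are hypothetical.

```python
import csv
import io
from collections import Counter

# Hypothetical input with a deliberate duplicate (row_key, col_key) pair.
SAMPLE_TEXT = (
    "site,date,analyte,value\n"
    "01,2026-05-04,Lead,0.5\n"
    "01,2026-05-04,Lead,0.6\n"
)


def read_as_strings(text):
    """Parse CSV text; csv.DictReader leaves every cell as a str."""
    return list(csv.DictReader(io.StringIO(text)))


def find_duplicate_keys(rows, row_headers, col_headers):
    """Return the sorted (row_key, col_key) pairs that occur more than once."""
    counts = Counter(
        (tuple(r[h] for h in row_headers), tuple(r[h] for h in col_headers))
        for r in rows
    )
    return sorted(k for k, n in counts.items() if n > 1)


def check_no_duplicates(rows, row_headers, col_headers):
    """Raise ValueError instead of silently keeping one of several values."""
    dupes = find_duplicate_keys(rows, row_headers, col_headers)
    if dupes:
        raise ValueError(
            f"{len(dupes)} duplicated (row_key, col_key) combination(s), "
            f"first: {dupes[0]}"
        )


rows = read_as_strings(SAMPLE_TEXT)
```

In this sample, `rows[0]["site"]` stays the string `"01"` rather than becoming the number 1, and `check_no_duplicates(rows, ("site", "date"), ("analyte",))` raises `ValueError` because the same site, date, and analyte appear twice.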
