Rearrange data from a normalized CSV format to a crosstabulated format, with styling.

crosstab

crosstab rearranges data from a normalized CSV format to a crosstabulated XLSX workbook, with styling. The pivot is computed in a single pass by DuckDB and the workbook is produced by XlsxWriter, so even very large inputs are crosstabbed in seconds. Column names containing spaces, parentheses, embedded quotes, Unicode, leading digits, or SQL reserved words pass through unmodified.

Go from this:

[Image: Crosstab Input]

To this:

[Image: Crosstab Output]

Installation

You can install crosstab via pip from PyPI:

pip install crosstab

There is also a Docker image available on the GitHub Container Registry:

docker pull ghcr.io/geocoug/crosstab:latest

Usage

The output workbook contains:

  1. README — metadata about the run (timestamp, user, script version, input/output paths).
  2. Crosstab — the pivoted table. Row-header values are listed on the left; each distinct combination of column-header values fans out across the top, with one sub-column per requested value column.
  3. Source Data (optional) — a verbatim copy of the input CSV, written when keep_src=True.
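As a quick check on a generated workbook, the sheet list can be read with the standard library alone, since an XLSX file is a ZIP archive whose xl/workbook.xml part names each worksheet. This is a minimal sketch; the helper name sheet_names is hypothetical and is not part of crosstab.

```python
import xml.etree.ElementTree as ET
import zipfile
from pathlib import Path

# Namespace of the main SpreadsheetML workbook part.
NS = {"m": "http://schemas.openxmlformats.org/spreadsheetml/2006/main"}


def sheet_names(xlsx_path: Path) -> list[str]:
    """Return the worksheet names of an XLSX file in workbook order.

    Hypothetical helper for illustration; not part of the crosstab package.
    """
    with zipfile.ZipFile(xlsx_path) as zf:
        root = ET.fromstring(zf.read("xl/workbook.xml"))
    # Each <sheet> element under <sheets> carries the visible name.
    return [s.attrib["name"] for s in root.findall("m:sheets/m:sheet", NS)]
```

Assuming the list entries above are the literal sheet titles, calling sheet_names(Path("crosstabbed_data.xlsx")) on a workbook written with keep_src=True would be expected to return those three names in order.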

Each of the examples below produces the same output.

Python

from pathlib import Path

from crosstab import Crosstab

Crosstab(
    incsv=Path("data.csv"),
    outxlsx=Path("crosstabbed_data.xlsx"),
    row_headers=("location", "sample"),
    col_headers=("cas_rn", "parameter"),
    value_cols=("concentration", "units"),
    keep_src=True,
).crosstab()
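
To try the call above without real data, a small normalized input file can be generated first. This is only a sketch: the column names match the example, but every value is invented for illustration, and write_sample_csv is a hypothetical helper rather than part of crosstab.

```python
import csv
from pathlib import Path

# Column names match the example call; all values below are invented.
FIELDS = ["location", "sample", "cas_rn", "parameter", "concentration", "units"]
ROWS = [
    ("Site A", "S-01", "7439-92-1", "Lead", "0.12", "mg/L"),
    ("Site A", "S-01", "7440-38-2", "Arsenic", "0.03", "mg/L"),
    ("Site B", "S-02", "7439-92-1", "Lead", "0.40", "mg/L"),
]


def write_sample_csv(path: Path) -> Path:
    """Write one header row plus one row per measurement (normalized form)."""
    with path.open("w", newline="", encoding="utf-8") as fh:
        writer = csv.writer(fh)
        writer.writerow(FIELDS)
        writer.writerows(ROWS)
    return path
```

Each (location, sample) and (cas_rn, parameter) pairing appears only once in these rows, so the file should not trip the duplicate check described under Behavior.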

Command Line

-r, -c, and -v each accept one or more column names following the flag:

crosstab -s \
    -f data.csv \
    -o crosstabbed_data.xlsx \
    -r location sample \
    -c cas_rn parameter \
    -v concentration units

Run crosstab --help for the full option list.
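
For readers unfamiliar with flags that take several values, the sketch below shows how such an interface is commonly declared with argparse and nargs="+". It is not the project's actual parser: the dest names are borrowed from the Python example, and the reading of -s as "keep the source sheet" is an assumption based on the examples producing the same output.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Illustrative parser only; crosstab's real CLI may be defined differently."""
    parser = argparse.ArgumentParser(prog="crosstab-sketch")
    # Assumed meaning: -s mirrors keep_src=True in the Python example.
    parser.add_argument("-s", dest="keep_src", action="store_true")
    parser.add_argument("-f", dest="incsv", required=True)
    parser.add_argument("-o", dest="outxlsx", required=True)
    # nargs="+" collects every following token up to the next flag.
    parser.add_argument("-r", dest="row_headers", nargs="+", required=True)
    parser.add_argument("-c", dest="col_headers", nargs="+", required=True)
    parser.add_argument("-v", dest="value_cols", nargs="+", required=True)
    return parser


args = build_parser().parse_args(
    ["-s", "-f", "data.csv", "-o", "crosstabbed_data.xlsx",
     "-r", "location", "sample",
     "-c", "cas_rn", "parameter",
     "-v", "concentration", "units"]
)
print(args.row_headers)  # ['location', 'sample']
```

Because each flag keeps consuming tokens until the next flag, a column name containing spaces has to be quoted in the shell to arrive as a single argument.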

Docker

docker run --rm -v "$(pwd)":/data ghcr.io/geocoug/crosstab:latest \
    -s -f /data/data.csv -o /data/crosstabbed_data.xlsx \
    -r location sample \
    -c cas_rn parameter \
    -v concentration units

Behavior

  • Strings preserved. All CSV cells are read as strings via DuckDB's read_csv(..., all_varchar=True), so values like 01 and 2026-05-04 are not coerced to numbers or dates.
  • Deterministic ordering. Row keys and column keys are sorted before being written, so re-running the same input produces identical output apart from the timestamp on the README sheet.
  • Strict duplicate detection. If any (row_key, col_key) combination appears more than once in the input, the run fails with a clear ValueError rather than silently dropping data.
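
The three behaviors above can be illustrated with a small pure-Python pivot. This is a sketch under stated assumptions, not crosstab's implementation (which uses DuckDB): the function name pivot and its return shape are hypothetical, and the csv module stands in for read_csv because it also yields every cell as a string.

```python
import csv
import io


def pivot(csv_text, row_headers, col_headers, value_cols):
    """Pivot normalized CSV text into (row_keys, col_keys, table).

    Hypothetical sketch: cells stay strings, keys are sorted, and a repeated
    (row_key, col_key) combination raises ValueError.
    """
    cells = {}
    for rec in csv.DictReader(io.StringIO(csv_text)):
        row_key = tuple(rec[h] for h in row_headers)
        col_key = tuple(rec[h] for h in col_headers)
        if (row_key, col_key) in cells:
            # Fail loudly instead of keeping only one of the duplicates.
            raise ValueError(f"duplicate combination: row={row_key}, col={col_key}")
        cells[(row_key, col_key)] = tuple(rec[v] for v in value_cols)

    # Sorting both axes makes the layout independent of input row order.
    row_keys = sorted({rk for rk, _ in cells})
    col_keys = sorted({ck for _, ck in cells})
    # Combinations absent from the input become None (an empty cell).
    table = [[cells.get((rk, ck)) for ck in col_keys] for rk in row_keys]
    return row_keys, col_keys, table
```

With this sketch, a value such as 01 comes back as the string "01", shuffling the input rows leaves the result unchanged, and adding a second row for an existing row and column combination raises ValueError.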
