Rearrange data from a normalized CSV format to a crosstabulated format, with styling.
crosstab
crosstab rearranges data from a normalized CSV format to a crosstabulated XLSX workbook, with styling. The pivot is computed in a single pass by DuckDB and the workbook is produced by XlsxWriter, so even very large inputs are crosstabbed in seconds. Column names containing spaces, parentheses, embedded quotes, unicode, leading digits, or SQL reserved words pass through unmodified.
Go from this:
To this:
Installation
You can install crosstab via pip from PyPI:
pip install crosstab
There is also a Docker image available on the GitHub Container Registry:
docker pull ghcr.io/geocoug/crosstab:latest
Usage
The output workbook contains:
- README — metadata about the run (timestamp, user, script version, input/output paths).
- Crosstab — the pivoted table. Row-header values are listed on the left; each distinct combination of column-header values fans out across the top, with one sub-column per requested value column.
- Source Data (optional) — a verbatim copy of the input CSV, written when keep_src=True.
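The layout of the Crosstab sheet can be illustrated with a small pure-Python sketch. This is not how crosstab computes the pivot (that is done by DuckDB); the records, the pivot helper, and the " / " header separator are all hypothetical and exist only to show how row headers, column headers, and value columns map onto the output grid.

```python
from collections import defaultdict

# Hypothetical normalized records: one measurement per row.
records = [
    {"location": "A", "sample": "S1", "parameter": "Lead", "concentration": "0.5", "units": "mg/L"},
    {"location": "A", "sample": "S1", "parameter": "Zinc", "concentration": "1.2", "units": "mg/L"},
    {"location": "B", "sample": "S2", "parameter": "Lead", "concentration": "0.7", "units": "mg/L"},
]

row_headers = ("location", "sample")
col_headers = ("parameter",)
value_cols = ("concentration", "units")


def pivot(records, row_headers, col_headers, value_cols):
    """Return (header, rows) for a crosstab of the records (illustrative only)."""
    cells = defaultdict(dict)
    col_keys = set()
    for rec in records:
        row_key = tuple(rec[h] for h in row_headers)
        col_key = tuple(rec[h] for h in col_headers)
        col_keys.add(col_key)
        cells[row_key][col_key] = tuple(rec[v] for v in value_cols)

    ordered_cols = sorted(col_keys)
    # One sub-column per value column under each column-header combination.
    header = list(row_headers) + [
        " / ".join(col_key + (v,)) for col_key in ordered_cols for v in value_cols
    ]
    rows = []
    for row_key in sorted(cells):
        row = list(row_key)
        for col_key in ordered_cols:
            # Combinations missing from the input become empty cells.
            row.extend(cells[row_key].get(col_key, ("",) * len(value_cols)))
        rows.append(row)
    return header, rows


header, rows = pivot(records, row_headers, col_headers, value_cols)
```

Here the header gains a concentration and a units sub-column for each parameter, and location B has empty cells under Zinc because no such record exists in the input.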
Each of the examples below produces the same output.
Python
from pathlib import Path
from crosstab import Crosstab

Crosstab(
    incsv=Path("data.csv"),
    outxlsx=Path("crosstabbed_data.xlsx"),
    row_headers=("location", "sample"),
    col_headers=("cas_rn", "parameter"),
    value_cols=("concentration", "units"),
    keep_src=True,
).crosstab()
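To try the call above without real data, a small input file can be generated first. The following is a sketch using only the standard library: the rows are made-up illustration values, the column names simply match those used in the example, and write_sample_csv is a hypothetical helper that is not part of crosstab.

```python
import csv
import tempfile
from pathlib import Path

# Column names match the example call; the values are invented for illustration.
FIELDS = ["location", "sample", "cas_rn", "parameter", "concentration", "units"]
ROWS = [
    ["Site A", "01", "7439-92-1", "Lead", "0.5", "mg/L"],
    ["Site A", "01", "7440-66-6", "Zinc", "1.2", "mg/L"],
    ["Site B", "02", "7439-92-1", "Lead", "0.7", "mg/L"],
]


def write_sample_csv(path: Path) -> Path:
    """Write a small normalized CSV (one measurement per row) and return its path."""
    with path.open("w", newline="", encoding="utf-8") as fh:
        writer = csv.writer(fh)
        writer.writerow(FIELDS)
        writer.writerows(ROWS)
    return path


# Written to a temporary directory here; pass the returned path as incsv.
sample_path = write_sample_csv(Path(tempfile.mkdtemp()) / "data.csv")
```

Each (location, sample) pair appears at most once per (cas_rn, parameter) pair, so the file contains no duplicate row/column combinations.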
Command Line
-r, -c, and -v each accept one or more column names following the flag:
crosstab -s \
    -f data.csv \
    -o crosstabbed_data.xlsx \
    -r location sample \
    -c cas_rn parameter \
    -v concentration units
Run crosstab --help for the full option list.
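For readers unfamiliar with flags that take several values, the behavior of -r, -c, and -v resembles argparse's nargs="+". The sketch below is an assumption-laden illustration, not crosstab's actual parser: the program name and the dest names are hypothetical, and only the three multi-value flags are modeled.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser modeling flags that accept one or more column names."""
    parser = argparse.ArgumentParser(prog="crosstab-sketch")
    # nargs="+" collects every following token until the next flag.
    parser.add_argument("-r", dest="row_headers", nargs="+", required=True)
    parser.add_argument("-c", dest="col_headers", nargs="+", required=True)
    parser.add_argument("-v", dest="value_cols", nargs="+", required=True)
    return parser


args = build_parser().parse_args(
    ["-r", "location", "sample", "-c", "cas_rn", "parameter", "-v", "concentration", "units"]
)
```

Each flag ends up holding a list of the names that followed it, which is why no separator is needed between column names on the command line.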
Docker
docker run --rm -v "$(pwd)":/data ghcr.io/geocoug/crosstab:latest \
    -s -f /data/data.csv -o /data/crosstabbed_data.xlsx \
    -r location sample \
    -c cas_rn parameter \
    -v concentration units
Behavior
- Strings preserved. All CSV cells are read as strings via DuckDB's read_csv(..., all_varchar=True), so values like 01 and 2026-05-04 are not coerced to numbers or dates.
- Deterministic ordering. Row keys and column keys are sorted before being written, so re-running the same input produces a byte-identical output (modulo the timestamp on the README sheet).
- Strict duplicate detection. If any (row_key, col_key) combination appears more than once in the input, the run fails with a clear ValueError rather than silently dropping data.
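The three behaviors can be sketched with the standard library alone. This is an illustration of the ideas rather than crosstab's implementation, which relies on DuckDB; load_cells, ordered_keys, and the sample CSV text are hypothetical, and the error message wording is invented.

```python
import csv
import io


def load_cells(csv_text, row_headers, col_headers, value_cols):
    """Map (row_key, col_key) to value tuples, keeping every cell as a string."""
    cells = {}
    # csv.DictReader never converts types, so "01" stays "01".
    for rec in csv.DictReader(io.StringIO(csv_text)):
        key = (
            tuple(rec[h] for h in row_headers),
            tuple(rec[h] for h in col_headers),
        )
        if key in cells:
            # Fail loudly instead of letting a later row overwrite an earlier one.
            raise ValueError(f"duplicate (row_key, col_key) combination: {key}")
        cells[key] = tuple(rec[v] for v in value_cols)
    return cells


def ordered_keys(cells):
    """Return sorted row keys and column keys so output order is reproducible."""
    row_keys = sorted({row_key for row_key, _ in cells})
    col_keys = sorted({col_key for _, col_key in cells})
    return row_keys, col_keys


GOOD = "sample,parameter,value\n02,Zinc,2026-05-04\n01,Lead,01\n"
DUPLICATED = GOOD + "01,Lead,99\n"

cells = load_cells(GOOD, ("sample",), ("parameter",), ("value",))
row_keys, col_keys = ordered_keys(cells)
```

Calling load_cells on DUPLICATED raises ValueError, because the (01, Lead) combination appears twice; on GOOD, the keys come back sorted regardless of the order of the input rows.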