Rearrange data from a normalized CSV format to a crosstabulated format, with styling.

crosstab

crosstab rearranges data from a normalized CSV format to a crosstabulated XLSX workbook, with styling. The pivot is computed in a single pass by DuckDB and the workbook is produced by XlsxWriter, so even very large inputs crosstab in seconds. Column names containing spaces, parentheses, embedded quotes, unicode, leading digits, or SQL reserved words pass through unmodified.

Go from this:

[Image: Crosstab Input]

To this:

[Image: Crosstab Output]

Installation

You can install crosstab via pip from PyPI:

pip install crosstab

There is also a Docker image available on the GitHub Container Registry:

docker pull ghcr.io/geocoug/crosstab:latest

Usage

The output workbook contains:

  1. Crosstab — the pivoted table. Row-header values are listed on the left; each distinct combination of column-header values fans out across the top, with one sub-column per requested value column.
  2. Source Data (optional) — a verbatim copy of the input CSV, written when keep_src=True.
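
To make the layout concrete, here is a minimal pure-Python sketch of the pivot described above. It is an illustration only, not the library's implementation (which uses DuckDB and XlsxWriter); the `pivot` helper and the sample rows are hypothetical, and the column names are borrowed from the usage examples in this README.

```python
import csv
import io

# Hypothetical sample in the normalized layout used by the usage examples.
SAMPLE_CSV = """location,sample,cas_rn,parameter,concentration,units
Site A,S-01,7440-38-2,Arsenic,0.5,mg/L
Site A,S-01,7439-92-1,Lead,0.02,mg/L
Site B,S-02,7440-38-2,Arsenic,1.1,mg/L
"""


def pivot(rows, row_headers, col_headers, value_cols):
    """Map each row key to {(col_key, value_col): value}."""
    table = {}
    for rec in rows:
        row_key = tuple(rec[h] for h in row_headers)
        col_key = tuple(rec[h] for h in col_headers)
        cells = table.setdefault(row_key, {})
        # One sub-column per requested value column under each column key.
        for value_col in value_cols:
            cells[(col_key, value_col)] = rec[value_col]
    return table


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
table = pivot(
    rows,
    row_headers=("location", "sample"),
    col_headers=("cas_rn", "parameter"),
    value_cols=("concentration", "units"),
)

# Sorted row keys on the left, sparse cells per row.
for row_key in sorted(table):
    print(row_key, table[row_key])
```

Each row key ends up with one cell per (column key, value column) pair that appears in the input; pairs that never occur for a row are simply absent, which corresponds to an empty cell in the workbook.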

Each of the examples below produces the same output.

Python

from pathlib import Path

from crosstab import Crosstab

Crosstab(
    incsv=Path("data.csv"),
    outxlsx=Path("crosstabbed_data.xlsx"),
    row_headers=("location", "sample"),
    col_headers=("cas_rn", "parameter"),
    value_cols=("concentration", "units"),
    keep_src=True,
).crosstab()

Command Line

-r, -c, and -v each accept one or more column names following the flag:

crosstab -s \
    -f data.csv \
    -o crosstabbed_data.xlsx \
    -r location sample \
    -c cas_rn parameter \
    -v concentration units

Run crosstab --help for the full option list.

Docker

docker run --rm -v "$(pwd)":/data ghcr.io/geocoug/crosstab:latest \
    -s -f /data/data.csv -o /data/crosstabbed_data.xlsx \
    -r location sample \
    -c cas_rn parameter \
    -v concentration units

Behavior

  • Strings preserved. All CSV cells are read as strings via DuckDB's read_csv(..., all_varchar=True), so values like 01 and 2026-05-04 are not coerced to numbers or dates.
  • Deterministic ordering. Row keys and column keys are sorted before being written, so re-running the same input produces a byte-identical output.
  • Strict duplicate detection. If any (row_key, col_key) combination appears more than once in the input, the run fails with a clear ValueError rather than silently dropping data. Pre-aggregate the CSV with DuckDB, pandas, polars, etc. before crosstabbing if your source data has duplicates that should be combined.
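
If you want to check for duplicates before running crosstab, the following standard-library sketch counts each (row_key, col_key) combination and reports those that occur more than once. The helper name `find_duplicate_keys` and the sample rows are hypothetical, and this is not how the library performs its own check; it only mirrors the rule described above.

```python
import csv
import io
from collections import Counter


def find_duplicate_keys(csv_file, row_headers, col_headers):
    """Return the sorted (row_key, col_key) pairs that occur more than once."""
    reader = csv.DictReader(csv_file)  # every cell is read as a string
    counts = Counter(
        (
            tuple(rec[h] for h in row_headers),
            tuple(rec[h] for h in col_headers),
        )
        for rec in reader
    )
    return sorted(key for key, n in counts.items() if n > 1)


# Hypothetical input: station "01" reports Lead twice.
sample = io.StringIO(
    "station,parameter,concentration\n"
    "01,Lead,0.02\n"
    "01,Lead,0.03\n"
    "02,Lead,0.01\n"
)
duplicates = find_duplicate_keys(sample, ("station",), ("parameter",))
print(duplicates)
```

For a file on disk, pass `open("results.csv", newline="")` in place of the `io.StringIO` object. An empty result means no key combination repeats, so the duplicate check described above should not be triggered by that input.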

Filling empty cells

By default, cells with no matching (row_key, col_key) row are left blank. Pass fill="N/A" (or any other string) in Python, or --fill on the command line, to substitute a placeholder:

crosstab --fill "N/A" \
    -f results.csv \
    -r station -c parameter -v concentration
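
The effect of a fill value can be sketched in plain Python. This is an illustration under stated assumptions, not the library's code: `to_grid` is a hypothetical helper, the sample cells are made up, and the real tool writes an XLSX workbook rather than nested lists.

```python
def to_grid(cells, fill=""):
    """Build a dense grid from sparse cells keyed by (row_key, col_key).

    Returns a header row followed by one row per row key, both sorted.
    Missing combinations receive the fill value.
    """
    row_keys = sorted({row_key for row_key, _ in cells})
    col_keys = sorted({col_key for _, col_key in cells})
    header = [""] + col_keys
    body = [
        [row_key] + [cells.get((row_key, col_key), fill) for col_key in col_keys]
        for row_key in row_keys
    ]
    return [header] + body


# Hypothetical sparse input: ST-2 has no Lead result.
cells = {
    ("ST-1", "Arsenic"): "0.5",
    ("ST-1", "Lead"): "0.02",
    ("ST-2", "Arsenic"): "1.1",
}
grid = to_grid(cells, fill="N/A")
for line in grid:
    print(line)
```

With the default `fill=""` the missing cell stays blank, matching the default behavior described above; with `fill="N/A"` the placeholder appears only where no input row matched.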

Persisting the database

Pass keep_duckdb=True (or --keep-duckdb / -k) to save the staged input as a DuckDB database at <input>.duckdb so it can be queried again later — handy when you want to follow up the pivot with ad-hoc SQL without re-reading the CSV.
