Eyeball a CSV in seconds: rows × columns, per-column type, null %, ranges, and top values. Zero dependencies, no pandas.

Project description

csvsight

Eyeball a CSV in seconds. Someone drops a CSV export on you and you just want to know: how many rows, what columns, which ones are mostly empty, what's the range of that amount field? csvsight prints exactly that. Zero dependencies — no pandas, no REPL.

csvsight data.csv
data.csv — 10,432 rows × 5 columns  (comma, utf-8)

#  column   type    nulls         unique  detail
─  ───────  ──────  ────────────  ──────  ─────────────────────────────────────────
1  id       int     0 (0.0%)      10,432  min 1 · max 10432 · mean 5216.5
2  email    string  12 (0.1%)     10,411  e.g. "ada@example.com" · len 9–48
3  amount   float   34 (0.3%)     2,015   min 0.01 · max 9999 · mean 42.3
4  status   string  0 (0.0%)      3       active (61%) · churned (28%) · trial (11%)
5  country  string  120 (1.1%)    47      US (40%) · GB (12%) · DE (7%)

Why

You don't need a DataFrame to answer "what's in this file?" — but the usual tools make you spin one up anyway:

  • pandas means pip install pandas, a Python session, and remembering the API for .describe() / .isna().sum() / .nunique().
  • csvkit is lovely but pulls in a handful of dependencies.
  • Excel chokes on big files and isn't in your terminal.

csvsight is one command on a CSV. It auto-detects the delimiter, infers each column's type, counts 10+ spellings of "missing" (NULL, N/A, nan, -, none, empty, …), and shows ranges for numbers and value distributions for categorical columns.
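
As a rough illustration of the null counting and type inference described above, here is a minimal Python sketch. It is not csvsight's source: the `MISSING` set, `is_missing`, and `infer_type` are hypothetical names, and the list of spellings is an assumption built from the examples given here.

```python
# Illustrative sketch only; not csvsight's actual implementation.
# Hypothetical set of "missing" spellings, compared case-insensitively.
MISSING = {"", "null", "n/a", "na", "nan", "-", "none"}


def is_missing(value: str) -> bool:
    """True if the cell looks like one of the 'missing' spellings."""
    return value.strip().lower() in MISSING


def _parses(value: str, caster) -> bool:
    """True if caster(value) succeeds (caster is int or float)."""
    try:
        caster(value)
        return True
    except ValueError:
        return False


def infer_type(values):
    """Return 'int', 'float', or 'string' based on the non-missing values."""
    present = [v.strip() for v in values if not is_missing(v)]
    if not present:
        return "string"  # nothing to infer from; assume string
    if all(_parses(v, int) for v in present):
        return "int"
    if all(_parses(v, float) for v in present):
        return "float"
    return "string"


column = ["1", "2", "N/A", "3.5", ""]
nulls = sum(is_missing(v) for v in column)
print(infer_type(column), nulls)  # prints: float 2
```

Python's `int()` and `float()` are fairly permissive (they accept `1_000` and `inf`, for example), so a real profiler would probably want stricter parsing than this sketch uses.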

Usage

csvsight data.csv              # profile a file
cat data.csv | csvsight        # or read from stdin
csvsight data.tsv              # delimiter auto-detected (, tab ; |)

Option           Description
--delimiter <c>  force the field delimiter
--no-header      treat the first row as data (columns named col1, col2, …)
--top <n>        top N values for categorical columns (default 3)
--json           emit the analysis as JSON instead of the table
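
To make the delimiter and header options concrete, the sketch below shows one way auto-detection with a forced override could work, using the standard library's `csv.Sniffer`. This is an assumption for illustration, not a description of csvsight's internals; `read_rows` and its parameters are hypothetical.

```python
import csv
import io


def read_rows(text, delimiter=None, header=True):
    """Parse CSV text; sniff the delimiter unless one is forced.

    Returns (delimiter, column_names, data_rows).
    """
    if delimiter is None:
        try:
            # Only consider the four candidates: comma, tab, semicolon, pipe.
            dialect = csv.Sniffer().sniff(text[:4096], delimiters=",\t;|")
            delimiter = dialect.delimiter
        except csv.Error:
            delimiter = ","  # fall back to comma when sniffing fails
    rows = list(csv.reader(io.StringIO(text), delimiter=delimiter))
    if not rows:
        return delimiter, [], []
    if header:
        names, data = rows[0], rows[1:]
    else:
        # Mirrors --no-header: synthesize col1, col2, ... and keep every row.
        names = [f"col{i}" for i in range(1, len(rows[0]) + 1)]
        data = rows
    return delimiter, names, data


print(read_rows("id;amount\n1;2.5\n2;3.0\n"))
print(read_rows("1|2\n3|4\n", delimiter="|", header=False))
```

Sniffing is a heuristic and can fail on single-column or very irregular files, which is the kind of case where forcing the delimiter is useful.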

What it reports per column

  • type — int / float / string (inferred from the non-null values)
  • nulls — count and percentage, recognizing many "missing" spellings
  • unique — distinct non-null values
  • detail — numbers get min · max · mean; low-cardinality columns get their value distribution; free-text columns get an example and length range
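
The three kinds of detail text listed above could be produced along these lines. This is a hedged sketch rather than the tool's code: `describe` is a hypothetical helper, and the `max_categories` cutoff for "low-cardinality" is an assumed value, since no threshold is stated here.

```python
from collections import Counter
from statistics import mean


def describe(values, col_type, top=3, max_categories=20):
    """Build a 'detail' string from one column's non-null values.

    max_categories is an assumed cutoff for "low-cardinality".
    """
    if not values:
        return ""
    if col_type in ("int", "float"):
        nums = [float(v) for v in values]
        return f"min {min(nums):g} · max {max(nums):g} · mean {mean(nums):g}"
    counts = Counter(values)
    if len(counts) <= max_categories:
        # Low-cardinality column: show the top N values with their share.
        total = len(values)
        parts = [
            f"{val} ({100 * n / total:.0f}%)"
            for val, n in counts.most_common(top)
        ]
        return " · ".join(parts)
    # Free-text column: one example plus the length range.
    lengths = [len(v) for v in values]
    return f'e.g. "{values[0]}" · len {min(lengths)}–{max(lengths)}'


print(describe(["1", "2", "3"], "int"))
print(describe(["a", "a", "b", "a"], "string"))
```

The `top` parameter plays the role of the `--top` option; everything else about the formatting is only modeled on the sample output.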

Install

pip install csvsight        # Python >= 3.8
npx csvsight data.csv       # Node >= 18 (byte-for-byte port)

License

MIT

Download files

Source Distribution

csvsight-0.1.0.tar.gz (10.1 kB)

Uploaded Source

Built Distribution

csvsight-0.1.0-py3-none-any.whl (9.0 kB)

Uploaded Python 3

File details

Details for the file csvsight-0.1.0.tar.gz.

File metadata

  • Download URL: csvsight-0.1.0.tar.gz
  • Upload date:
  • Size: 10.1 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.11.7

File hashes

Hashes for csvsight-0.1.0.tar.gz
Algorithm    Hash digest
SHA256       8707573c27012a237ff087cd6074226cf714241245c112124b10a1a50018cf14
MD5          9cf5a66e75d57acf7965f9ac275a9b15
BLAKE2b-256  7910b0bcc6c32702f612000add993caed93c40cbd58f9626364a80fec2f92a3b

File details

Details for the file csvsight-0.1.0-py3-none-any.whl.

File metadata

  • Download URL: csvsight-0.1.0-py3-none-any.whl
  • Upload date:
  • Size: 9.0 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.11.7

File hashes

Hashes for csvsight-0.1.0-py3-none-any.whl
Algorithm    Hash digest
SHA256       f4bc6a9ac87d79627b1c00935572055a689e5f0bfe6309b4468ab6932093a291
MD5          841554cb347f8189ae9bcfab725da3b7
BLAKE2b-256  374dd2476d4c6951e64769743bf887d6a84ff6b4e53b13eda94af615fd85d0e9
