A tool for analyzing and describing CSV files

DescribeCSV

A Python tool for analyzing and describing CSV files. It provides detailed information about file structure, data types, missing values, and statistical summaries.

Features

  • Automatic encoding detection
  • Handles large files through chunked processing
  • Detailed column analysis including:
    • Data types
    • Missing values
    • Unique value counts
    • Statistical summaries for numeric columns
    • Top values for categorical columns
  • Detection of numeric data stored as strings
  • Duplicate row detection
  • File metadata information
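Several of these checks can be illustrated with the standard library alone. The following is a minimal sketch, not DescribeCSV's actual implementation: the function names `looks_numeric` and `profile_stream`, the set of missing-value tokens, and the result keys are all assumptions made for illustration. It streams rows one at a time (a simple stand-in for chunked processing), counts missing cells, flags columns whose text values all parse as numbers, and counts duplicate rows.

```python
import csv
import io


def looks_numeric(value: str) -> bool:
    """Return True if the string parses as a float (e.g. '3.14', '-2', '1e5')."""
    try:
        float(value)
        return True
    except ValueError:
        return False


def profile_stream(handle, missing_tokens=("", "NA", "NaN")):
    """Scan a CSV stream row by row so the whole file is never loaded at once.

    Hypothetical helper for illustration; not part of DescribeCSV.
    """
    reader = csv.DictReader(handle)
    columns = list(reader.fieldnames or [])
    missing = {name: 0 for name in columns}      # missing cells per column
    present = {name: 0 for name in columns}      # non-missing cells per column
    non_numeric = {name: 0 for name in columns}  # non-missing cells that are not numbers
    seen_rows = set()  # full row tuples; memory grows with the number of distinct rows
    duplicates = 0
    rows = 0

    for row in reader:
        rows += 1
        key = tuple(row.get(name) for name in columns)
        if key in seen_rows:
            duplicates += 1
        else:
            seen_rows.add(key)
        for name in columns:
            value = (row.get(name) or "").strip()
            if value in missing_tokens:
                missing[name] += 1
                continue
            present[name] += 1
            if not looks_numeric(value):
                non_numeric[name] += 1

    # A column counts as numeric only if it has data and every value parsed.
    numeric_columns = [
        name for name in columns if present[name] > 0 and non_numeric[name] == 0
    ]
    return {
        "rows": rows,
        "duplicate_rows": duplicates,
        "missing": missing,
        "numeric_columns": numeric_columns,
    }


if __name__ == "__main__":
    sample = "id,price,city\n1,9.5,Oslo\n2,,Bergen\n1,9.5,Oslo\n3,abc,NA\n"
    print(profile_stream(io.StringIO(sample)))
```

Keeping every distinct row in a set is the simplest way to find duplicates, but it costs memory on large files; a real tool may use a different strategy.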

Installation

pip install describecsv

Usage

From the command line:

describecsv path/to/your/file.csv

This writes a JSON file containing the analysis results to the same directory as your CSV file.
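Once the command has run, the report can be read back with Python's `json` module. The name of the output file is not stated here, so the sketch below makes no assumption about it and simply loads every JSON file that sits next to the CSV. The helper `load_reports` is hypothetical and not part of DescribeCSV.

```python
import json
from pathlib import Path


def load_reports(csv_path):
    """Load every JSON file in the CSV's directory, keyed by file name.

    Hypothetical helper: the report's file name is unknown, so all
    *.json files beside the CSV are returned.
    """
    folder = Path(csv_path).resolve().parent
    reports = {}
    for json_file in sorted(folder.glob("*.json")):
        with json_file.open(encoding="utf-8") as handle:
            reports[json_file.name] = json.load(handle)
    return reports
```

For example, `load_reports("path/to/your/file.csv")` returns a dictionary that maps each JSON file name in that directory to its parsed contents.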

Output

The tool generates a detailed JSON report including:

  • Basic file information (size, encoding, etc.)
  • Row and column counts
  • Missing value analysis
  • Column-by-column analysis including:
    • Data types
    • Unique values
    • Missing values
    • Statistical summaries for numeric columns
    • Most common values for categorical columns
    • Suggestions for data quality improvements
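To make the column-by-column items concrete, here is a rough sketch of how such a summary could be computed and serialized. It is an illustration only: the function `summarize_column` and the keys `missing`, `unique`, `type`, `min`, `max`, `mean`, and `top_values` are assumptions and do not reflect the tool's actual JSON schema.

```python
import json
import statistics
from collections import Counter


def summarize_column(values, top_n=3):
    """Summarize one column given its raw string values.

    Hypothetical helper; the key names are illustrative, not DescribeCSV's schema.
    """
    present = [v for v in values if v not in ("", None)]
    summary = {
        "missing": len(values) - len(present),
        "unique": len(set(present)),
    }
    try:
        numbers = [float(v) for v in present]
    except ValueError:
        numbers = None  # at least one value is not a number

    if numbers:
        # Every non-missing value parsed as a number: report basic statistics.
        summary["type"] = "numeric"
        summary["min"] = min(numbers)
        summary["max"] = max(numbers)
        summary["mean"] = statistics.fmean(numbers)
    else:
        # Otherwise treat the column as categorical and list its most common values.
        summary["type"] = "categorical"
        summary["top_values"] = [
            [value, count] for value, count in Counter(present).most_common(top_n)
        ]
    return summary


if __name__ == "__main__":
    report = {
        "price": summarize_column(["1", "2", "", "3"]),
        "city": summarize_column(["Oslo", "Bergen", "Oslo", ""]),
    }
    print(json.dumps(report, indent=2))
```

Top values are stored as `[value, count]` lists rather than tuples so the summary survives a round trip through JSON unchanged.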

License

This project is licensed under the MIT License; see the LICENSE file for details.
