A tool for analyzing and describing CSV files

Project description

DescribeCSV

A Python tool for analyzing and describing CSV files. It provides detailed information about file structure, data types, missing values, and statistical summaries.

Features

  • Automatic encoding detection
  • Handles large files through chunked processing
  • Detailed column analysis including:
    • Data types
    • Missing values
    • Unique value counts
    • Statistical summaries for numeric columns
    • Top values for categorical columns
  • Detection of numeric data stored as strings
  • Duplicate row detection
  • File metadata information
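
To make a few of these checks concrete, here is a minimal standard-library sketch that streams a CSV row by row, counting missing values, flagging columns whose non-empty values all parse as numbers, and counting duplicate rows. It is an illustration only, not DescribeCSV's implementation; the function name `profile_csv` and the layout of the returned dictionary are hypothetical.

```python
import csv
import io


def _is_number(text):
    """Return True if text parses as a float (e.g. '3', '-1.5', '1e3')."""
    try:
        float(text)
    except ValueError:
        return False
    return True


def profile_csv(handle):
    """Stream rows from an open text file and collect simple per-column stats.

    Hypothetical helper for illustration; not part of describecsv.
    """
    reader = csv.reader(handle)
    header = next(reader)
    missing = {name: 0 for name in header}
    # A column stays "numeric-like" until a non-empty, non-numeric value appears.
    numeric_like = {name: True for name in header}
    seen_values = {name: False for name in header}
    seen_rows = set()
    duplicates = 0
    rows = 0

    for row in reader:
        rows += 1
        key = tuple(row)
        if key in seen_rows:
            duplicates += 1
        else:
            seen_rows.add(key)
        for name, value in zip(header, row):
            if value.strip() == "":
                missing[name] += 1
            else:
                seen_values[name] = True
                if not _is_number(value):
                    numeric_like[name] = False

    return {
        "rows": rows,
        "duplicate_rows": duplicates,
        "missing": missing,
        "numeric_like": [n for n in header if numeric_like[n] and seen_values[n]],
    }


# Small in-memory example: row 3 repeats row 1, and row 2 has no price.
sample = io.StringIO("id,price,city\n1,9.50,Oslo\n2,,Bergen\n1,9.50,Oslo\n")
report = profile_csv(sample)
print(report)
```

The sketch keeps every distinct row in memory to spot duplicates, which is the simplest approach but not the most frugal one; storing a hash of each row instead would trade a small collision risk for lower memory use.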

Installation

pip install describecsv

Usage

From the command line:

describecsv path/to/your/file.csv

This creates a JSON file containing the analysis results in the same directory as the CSV file.
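
When the analysis is one step in a larger script, the command can also be launched from Python. The sketch below assumes only what is stated above: that `describecsv` is available on `PATH` and writes a JSON file next to the CSV. Because the report's file name is not specified here, the hypothetical helper `run_and_find_reports` returns whichever `.json` files were created or modified in the CSV's directory during the run.

```python
import subprocess
from pathlib import Path


def _json_snapshot(directory):
    """Map each .json file in directory to its modification time in nanoseconds."""
    return {p: p.stat().st_mtime_ns for p in Path(directory).glob("*.json")}


def run_and_find_reports(csv_path, command=("describecsv",)):
    """Run the CLI on csv_path and return JSON files it created or updated.

    Hypothetical helper; `command` is a parameter so that a different
    executable can be substituted.
    """
    csv_path = Path(csv_path).resolve()
    before = _json_snapshot(csv_path.parent)
    # check=True raises CalledProcessError if the command exits with an error.
    subprocess.run([*command, str(csv_path)], check=True)
    after = _json_snapshot(csv_path.parent)
    # A file counts as a report if it is new or its modification time changed.
    return sorted(p for p, mtime in after.items() if before.get(p) != mtime)
```

A call such as `run_and_find_reports("path/to/your/file.csv")` would then return a list of `Path` objects that can be opened with `json.load`.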

Output

The tool generates a detailed JSON report including:

  • Basic file information (size, encoding, etc.)
  • Row and column counts
  • Missing value analysis
  • Column-by-column analysis including:
    • Data types
    • Unique values
    • Missing values
    • Statistical summaries for numeric columns
    • Most common values for categorical columns
    • Suggestions for data quality improvements
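
Since the exact key names in the report are not listed here, a generic way to explore one is to load it and print its nested key structure. The following is a sketch using only the standard library; `outline` is a hypothetical helper, and the sample dictionary is made-up stand-in data, not real DescribeCSV output.

```python
import json


def outline(node, depth=0, max_depth=3):
    """Return indented lines describing the keys and value types in parsed JSON."""
    lines = []
    if depth > max_depth:
        return lines
    if isinstance(node, dict):
        for key, value in node.items():
            lines.append(f"{'  ' * depth}{key}: {type(value).__name__}")
            lines.extend(outline(value, depth + 1, max_depth))
    elif isinstance(node, list) and node:
        # Describe only the first element, assuming list items share one shape.
        lines.extend(outline(node[0], depth, max_depth))
    return lines


# Made-up stand-in data; in practice, load the generated report with json.load.
sample = json.loads('{"file": {"size": 10}, "columns": [{"name": "id", "missing": 0}]}')
print("\n".join(outline(sample)))
```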

License

This project is licensed under the MIT License - see the LICENSE file for details.

Download files

Source Distribution

describecsv-0.1.1.tar.gz (5.9 kB)

Uploaded Source

Built Distribution

describecsv-0.1.1-py3-none-any.whl (6.7 kB)

Uploaded Python 3

File details

Details for the file describecsv-0.1.1.tar.gz.

File metadata

  • Download URL: describecsv-0.1.1.tar.gz
  • Upload date:
  • Size: 5.9 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.10.13

File hashes

Hashes for describecsv-0.1.1.tar.gz
Algorithm Hash digest
SHA256 2a376f9a42854c5cd8d4562d8e1605542bb05deadb56002f0cebe52f1eaa92ca
MD5 e5e42a2bf2213c48b1bdc6aa9d6cc10e
BLAKE2b-256 5f881d3fa3fd42a267c5fe6f129b1e8e87aa58d341ee82dd80038de9b2423875
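
A downloaded archive can be checked against the SHA256 digest listed above with the standard library's `hashlib`. This is a minimal sketch; the helper names `sha256_of` and `matches_published_hash` are hypothetical, and the path passed in is whatever location the file was saved to.

```python
import hashlib

# SHA256 digest listed above for describecsv-0.1.1.tar.gz.
EXPECTED_SHA256 = "2a376f9a42854c5cd8d4562d8e1605542bb05deadb56002f0cebe52f1eaa92ca"


def sha256_of(path, chunk_size=65536):
    """Hash a file in fixed-size chunks so large files need little memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # iter() with a sentinel keeps reading until read() returns b"" at EOF.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path, expected=EXPECTED_SHA256):
    """Return True if the file at path has the expected SHA256 digest."""
    return sha256_of(path) == expected.lower()
```

For example, `matches_published_hash("describecsv-0.1.1.tar.gz")` should return `True` for an intact copy of the source distribution saved in the current directory.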

File details

Details for the file describecsv-0.1.1-py3-none-any.whl.

File metadata

  • Download URL: describecsv-0.1.1-py3-none-any.whl
  • Upload date:
  • Size: 6.7 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.10.13

File hashes

Hashes for describecsv-0.1.1-py3-none-any.whl
Algorithm Hash digest
SHA256 661e80504eb8246cb5dc7ee8861e6e6bbebbf2fbee3fd3e4e5f45c387c0a363a
MD5 c920ea8e9b1998d2a1aa78470d2c9e04
BLAKE2b-256 a12e2bbe5843820bbf3bd387e5204068e60c3d6b76ce257629faf7f8eacf72cc
