A tiny (<200 loc) CSV health + anomaly detector

datasanity

A tiny (<200 lines) CSV health-check + anomaly-detection tool.

It reports:

  • Missing value percentages
  • Duplicate row count
  • Cardinality per column
  • Basic anomaly score (numeric values with a z-score above 3)
  • Row + column summary
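
To make the metrics above concrete, here is a minimal standard-library sketch of how such checks can be computed. It is not datasanity's actual implementation: the function name `sketch_report`, the shape of the returned dictionary, the use of the population standard deviation, and the treatment of empty cells as missing are all assumptions made for illustration.

```python
import csv
import io
import statistics
from collections import Counter


def _is_missing(value):
    # Assumption: an absent or whitespace-only cell counts as missing.
    return value is None or value.strip() == ""


def sketch_report(csv_text, z_threshold=3.0):
    """Hypothetical helper: compute simple health metrics for CSV text."""
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = list(reader)
    columns = reader.fieldnames or []
    n = len(rows)

    # Missing value percentage per column.
    missing_pct = {
        c: (100.0 * sum(1 for r in rows if _is_missing(r[c])) / n) if n else 0.0
        for c in columns
    }

    # Duplicate rows: every repeat after the first occurrence counts once.
    seen = Counter(tuple(r[c] for c in columns) for r in rows)
    duplicates = sum(count - 1 for count in seen.values())

    # Cardinality: number of distinct non-missing values per column.
    cardinality = {
        c: len({r[c] for r in rows if not _is_missing(r[c])}) for c in columns
    }

    # Outliers: for fully numeric columns, count values with |z| > threshold.
    outliers = {}
    for c in columns:
        present = [r[c] for r in rows if not _is_missing(r[c])]
        try:
            nums = [float(v) for v in present]
        except ValueError:
            continue  # non-numeric column, skip it
        if len(nums) < 2:
            continue
        mean = statistics.fmean(nums)
        std = statistics.pstdev(nums)  # population std; an assumption
        if std == 0:
            outliers[c] = 0
            continue
        outliers[c] = sum(1 for x in nums if abs(x - mean) / std > z_threshold)

    return {
        "rows": n,
        "columns": len(columns),
        "missing_pct": missing_pct,
        "duplicates": duplicates,
        "cardinality": cardinality,
        "outliers": outliers,
    }
```

One consequence of the z-score rule is worth knowing: with the population standard deviation, a single value in a column of n values can never have a z-score above sqrt(n - 1), so a threshold of 3 cannot flag anything in columns with ten or fewer values.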

Install

pip install datasanity

Usage

from datasanity import analyze_csv, pretty_report

r = analyze_csv("data.csv")
print(pretty_report(r))
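
To try the snippet above without a dataset of your own, the following sketch writes a small CSV that contains one duplicate row, one missing value, and one extreme value, then passes it to the two functions shown above. The helper names `write_sample_csv` and `run_report`, the column names, and the sample values are hypothetical; only `analyze_csv` and `pretty_report` come from the package, and the exact report they produce is not shown here.

```python
import csv


def write_sample_csv(path):
    """Hypothetical helper: write a small CSV with known data problems."""
    rows = [[f"user{i}", "10", "Oslo"] for i in range(19)]
    rows.append(["user19", "1000", "Oslo"])  # extreme value in "score"
    rows.append(["user0", "10", "Oslo"])     # exact duplicate of the first row
    rows.append(["user20", "", "Bergen"])    # missing "score"
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(["name", "score", "city"])
        writer.writerows(rows)
    return len(rows)


def run_report(path):
    """Build the report text; requires `pip install datasanity`."""
    from datasanity import analyze_csv, pretty_report

    return pretty_report(analyze_csv(path))
```

Calling `write_sample_csv("data.csv")` followed by `print(run_report("data.csv"))` mirrors the usage shown above on a file whose problems are known in advance.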

CLI Demo

python run_demo.py

Local Development

To install the package locally for development:

cd datasanity
pip install -e .

Building for Distribution

To create a distributable package:

pip install build
python -m build

This creates a wheel (.whl) and a source archive (.tar.gz) in the dist/ directory, which you can share or upload to PyPI.
