🕵️‍♂️ Grebes: lightweight, nature-inspired data auditor
Grebes
🕵️‍♂️ Grebes: a lightweight, nature-inspired data quality auditor for structured datasets.
Features
- Fast, zero-config audit of CSV, Excel (.xls/.xlsx), JSON, and JSON-Lines files
- Rich CLI output with colored panels, sparklines, and warnings (powered by Rich)
- Key data quality checks:
  - Missing value counts & percentages
  - Unique-value ratio & (optional) samples
  - Numeric statistics (mean, std, min, max) & IQR-based outlier counts
  - Inline histograms (sparklines) for numeric distributions
  - Date range for datetime columns
  - Top-N frequencies for low-cardinality text/categorical columns
  - Mixed-type detection & duplicate-row warnings
- Two modes:
  - CLI: run grebes data.csv for an instant terminal report
  - Python API: import GrebesAuditor into notebooks or scripts
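The IQR-based outlier count in the list above can be illustrated with a short standard-library sketch. This is not Grebes' own code: the function name `iqr_outlier_count`, the 1.5 multiplier, and the inclusive quartile method are assumptions made for illustration, and `statistics.quantiles` needs Python 3.8 or later.

```python
import statistics

def iqr_outlier_count(values, k=1.5):
    """Count values outside [Q1 - k*IQR, Q3 + k*IQR], ignoring None.

    Hypothetical helper for illustration; Grebes may compute this differently.
    """
    data = sorted(v for v in values if v is not None)
    if len(data) < 4:
        return 0  # too few points for meaningful quartiles
    q1, _, q3 = statistics.quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return sum(1 for v in data if v < low or v > high)

print(iqr_outlier_count(list(range(1, 1001))))           # evenly spread ids
print(iqr_outlier_count([10, 12, 11, 13, 12, 11, 500]))  # one extreme value
```

An evenly spread column such as a sequential id produces no outliers under this rule, while a single extreme value in an otherwise tight column is flagged.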
Installation
# From PyPI (when published)
pip install grebes
# Or install your local copy in editable mode for development
git clone https://github.com/yourusername/grebes.git
cd grebes
pip install -e .
Requires Python ≥ 3.7 and the following packages: pandas, numpy, openpyxl (for Excel), and rich.
CLI Usage
# Basic audit of a CSV file
grebes data.csv
# Audit an Excel sheet
grebes report.xlsx
# Audit a JSON-Lines file
grebes records.jsonl
# Show help / available options
grebes --help
Sample Output
╭───────────────── GREBES DIAGNOSTIC REPORT ─────────────────╮
│ Rows: 1,000   Cols: 5   Mem: 180.21 KB                     │
╰────────────────────────────────────────────────────────────╯
╭─── id ─────────────────────────────────────────────────────╮
│ Type     int64                                             │
│ Missing  0 (0.0%)                                          │
│ Unique   1000                                              │
│ Stats    μ=500.5,σ=288.8,min=1.0,max=1000.0,out=0          │
│ Dist     (sparkline)                                       │
╰────────────────────────────────────────────────────────────╯
╭─── amount ─────────────────────────────────────────────────╮
│ Type     float64                                           │
│ Missing  0 (0.0%)                                          │
│ Stats    μ=495.4,σ=289.2,min=14.6,max=999.7,out=0          │
│ Dist     (sparkline)                                       │
╰────────────────────────────────────────────────────────────╯
… and so on for each column …
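The Dist row in each panel is an inline histogram. A minimal sketch of how such a sparkline could be produced is shown here; the function name `sparkline`, the eight-bin default, and the scaling rule are assumptions for illustration and are not taken from the Grebes source.

```python
BLOCKS = "▁▂▃▄▅▆▇█"  # eight block heights, lowest to highest

def sparkline(values, bins=8):
    """Render a histogram of `values` as a one-line string of block characters.

    Hypothetical helper; None entries are treated as missing and skipped.
    """
    data = [v for v in values if v is not None]
    if not data:
        return ""
    lo, hi = min(data), max(data)
    counts = [0] * bins
    width = (hi - lo) / bins or 1  # fall back to 1 for constant columns
    for v in data:
        idx = min(int((v - lo) / width), bins - 1)  # clamp the maximum into the last bin
        counts[idx] += 1
    peak = max(counts)
    # Scale each bin count to one of the eight block heights.
    return "".join(BLOCKS[round(c / peak * (len(BLOCKS) - 1))] for c in counts)

print(sparkline(range(1, 1001)))
print(sparkline([1, 1, 1, 2, 2, 3, 9]))
```

Equal-width bins keep the sketch short; a real implementation might prefer `numpy.histogram` for speed on large columns.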
Python API
import pandas as pd
from grebes.auditor import GrebesAuditor
df = pd.read_csv("data.csv")
auditor = GrebesAuditor(df)
auditor.print_report()
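The API takes a DataFrame, so files other than CSV have to be loaded first. One possible way to pick a pandas reader by file extension is sketched below. The names `READERS`, `pick_reader`, and `load_table` are hypothetical and not part of Grebes; only `GrebesAuditor` and `print_report` come from the example above.

```python
from pathlib import Path

# Hypothetical mapping from file extension to a pandas reader and its keyword arguments.
READERS = {
    ".csv": ("read_csv", {}),
    ".xls": ("read_excel", {}),
    ".xlsx": ("read_excel", {}),
    ".json": ("read_json", {}),
    ".jsonl": ("read_json", {"lines": True}),  # one JSON object per line
}

def pick_reader(path):
    """Return (pandas function name, kwargs) for a supported file extension."""
    suffix = Path(path).suffix.lower()
    if suffix not in READERS:
        raise ValueError(f"Unsupported file type: {path}")
    return READERS[suffix]

def load_table(path):
    """Load a supported file into a pandas.DataFrame."""
    import pandas as pd  # imported lazily so pick_reader works without pandas
    name, kwargs = pick_reader(path)
    return getattr(pd, name)(path, **kwargs)

# With Grebes installed, the result can be passed straight to the auditor:
#   from grebes.auditor import GrebesAuditor
#   GrebesAuditor(load_table("records.jsonl")).print_report()
print(pick_reader("records.jsonl"))
```

Note that pandas reads .xlsx through openpyxl, while legacy .xls files typically need the xlrd engine instead.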
How It Works
- Reads your file (CSV, Excel, JSON, or JSON-Lines) into a pandas.DataFrame.
- Computes column-wise metrics:
  - Missing values
  - Unique ratio (and optional sample values for low-cardinality columns)
  - Descriptive stats & outlier count for numerics
  - Date ranges for datetimes
  - Top frequencies for text/categorical columns
- Renders an interactive, colorized report with:
  - Panels per column
  - Sparklines for a quick glance at each distribution
  - Warnings for mixed-type columns & duplicates
- Makes zero external calls: everything runs locally, so it is safe to use on private data.
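Several of the column-wise metrics above can be sketched in plain Python to show what each one measures. This is an illustration only, not the Grebes implementation, which works on a pandas.DataFrame; the function names, the use of `None` for missing values, and the sample data are all made up for the example.

```python
from collections import Counter

def column_metrics(values, top_n=3):
    """Summarise one column given as a list; None marks a missing value."""
    total = len(values)
    present = [v for v in values if v is not None]
    missing = total - len(present)
    types = {type(v).__name__ for v in present}
    return {
        "missing": missing,
        "missing_pct": round(100 * missing / total, 1) if total else 0.0,
        "unique_ratio": round(len(set(present)) / len(present), 3) if present else 0.0,
        "mixed_types": len(types) > 1,  # e.g. strings and ints in one column
        "top": Counter(present).most_common(top_n),
    }

def duplicate_row_count(rows):
    """Count rows that repeat an earlier row (rows are tuples of hashable cells)."""
    return sum(n - 1 for n in Counter(rows).values())

city = ["Oslo", "Oslo", None, "Lima", 42]  # made-up column with a gap and a stray int
print(column_metrics(city))
print(duplicate_row_count([(1, "a"), (2, "b"), (1, "a")]))
```

Here the stray integer triggers the mixed-type flag, and the repeated tuple counts as one duplicate row.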
Contributing
- Fork the repo
- Create a feature branch: git checkout -b feat/my-awesome-feature
- Commit your changes: git commit -m "Add feature X"
- Push to your branch: git push origin feat/my-awesome-feature
- Open a Pull Request
Please follow the existing code style and add tests for new functionality.
License
MIT License © Your Name. See LICENSE for details.
Inspired by nature's grace: light as air, sharp as a grebe's dive.