Universal dataset profiling and intelligence tool

Aniwa

See your data clearly.

Aniwa is an open-source universal dataset profiling and intelligence tool designed for developers, analysts, data engineers, researchers, and modern data teams.

Aniwa helps you instantly understand datasets through:

  • schema profiling
  • data quality analysis
  • statistical summaries
  • intelligent insights
  • rich terminal reports
  • shareable HTML reports

Whether you're working with CSV files, Excel spreadsheets, JSON datasets, or Parquet files, Aniwa gives you a fast and elegant way to inspect and understand data.


Why Aniwa?

Data professionals constantly work with unknown datasets.

Before trusting a dataset, people need to know:

  • What columns exist?
  • What data types are present?
  • Are there missing values?
  • Are there duplicates?
  • Are there suspicious patterns?
  • Which columns might contain IDs or PII?
  • Is the dataset healthy?

Aniwa makes answering those questions simple.


Quick Installation

Install Aniwa from PyPI:

pip install aniwa

Verify installation:

aniwa --help

Quick Start

Profile a dataset:

aniwa customers.csv

Generate a JSON report:

aniwa customers.csv --report json --output profile.json

Generate an HTML report:

aniwa customers.csv --report html --output profile.html

Run lightweight profiling:

aniwa customers.csv --mode fast

Run full profiling:

aniwa customers.csv --mode deep

Supported Formats

Aniwa currently supports:

  • CSV
  • Excel (.xlsx)
  • JSON
  • Parquet

Features

Universal Dataset Support

Aniwa supports multiple modern dataset formats:

  • CSV
  • Excel
  • JSON
  • Parquet

Support for the following sources is planned for future releases:

  • PostgreSQL
  • MySQL
  • DuckDB
  • BigQuery
  • Snowflake

Core Profiling

Aniwa provides:

Dataset Summary

  • row counts
  • column counts
  • dataset size analysis

Schema Profiling

  • type inference
  • mixed type detection
  • schema overview
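To make "type inference" and "mixed type detection" concrete, here is a minimal sketch of one way such checks can work on raw string cells. It is an illustration only and does not show Aniwa's internals; the names `infer_type` and `profile_column_types` are hypothetical.

```python
from collections import Counter


def infer_type(value: str) -> str:
    """Classify one raw cell as 'null', 'bool', 'int', 'float', or 'str'."""
    text = value.strip()
    if text == "":
        return "null"
    if text.lower() in {"true", "false"}:
        return "bool"
    try:
        int(text)
        return "int"
    except ValueError:
        pass
    try:
        float(text)
        return "float"
    except ValueError:
        return "str"


def profile_column_types(values):
    """Return (dominant_type, is_mixed) for a column of raw strings."""
    counts = Counter(infer_type(v) for v in values)
    counts.pop("null", None)  # missing cells do not count toward the type
    if not counts:
        return "null", False
    dominant = counts.most_common(1)[0][0]
    # A column is "mixed" when more than one non-null type appears in it.
    return dominant, len(counts) > 1


print(profile_column_types(["1", "2", "x", ""]))
```

A column such as `["1", "2", "x", ""]` is reported as mostly integer but mixed, which is the kind of signal a schema overview can surface.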

Data Quality Analysis

  • null analysis
  • duplicate detection
  • uniqueness analysis
  • sparse column detection
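The checks listed above can be sketched with the Python standard library. This is a hedged illustration of what the terms mean, not Aniwa's implementation; `quality_summary`, the sample data, and the `sparse_threshold` default of 0.5 are assumptions made for the example.

```python
import csv
import io
from collections import Counter

# Hypothetical sample data: one missing email and one repeated row.
SAMPLE = """id,name,email
1,Ada,ada@example.com
2,Grace,
3,Linus,linus@example.com
3,Linus,linus@example.com
"""


def quality_summary(csv_text, sparse_threshold=0.5):
    """Compute null counts, duplicate rows, uniqueness, and sparse columns."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    total = len(rows)
    columns = list(rows[0].keys()) if rows else []

    # Null analysis: empty or whitespace-only cells count as missing.
    null_counts = {
        c: sum(1 for r in rows if (r[c] or "").strip() == "") for c in columns
    }
    # Duplicate detection: every repeat of an already-seen row counts once.
    row_counts = Counter(tuple(r[c] for c in columns) for r in rows)
    duplicate_rows = sum(n - 1 for n in row_counts.values())
    # Uniqueness: distinct values over total rows (an empty cell is a value here).
    unique_ratio = {c: len({r[c] for r in rows}) / total for c in columns}
    # Sparse columns: share of missing cells at or above the threshold.
    sparse = [c for c in columns if null_counts[c] / total >= sparse_threshold]

    return {
        "rows": total,
        "null_counts": null_counts,
        "duplicate_rows": duplicate_rows,
        "unique_ratio": unique_ratio,
        "sparse_columns": sparse,
    }


summary = quality_summary(SAMPLE)
print(summary)
```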

Statistical Profiling

  • minimum values
  • maximum values
  • mean
  • median
  • standard deviation
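For reference, the same five statistics can be computed with Python's `statistics` module. This sketch assumes missing values are skipped and uses the sample standard deviation; whether Aniwa uses the sample or population form is not stated here. The function name `numeric_summary` is hypothetical.

```python
import statistics


def numeric_summary(values):
    """Summarize a numeric column, skipping None entries."""
    data = [float(v) for v in values if v is not None]
    if not data:
        return None
    return {
        "min": min(data),
        "max": max(data),
        "mean": statistics.fmean(data),
        "median": statistics.median(data),
        # Sample standard deviation needs at least two points.
        "stdev": statistics.stdev(data) if len(data) > 1 else 0.0,
    }


print(numeric_summary([1, 2, 3, 4, None]))
```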

Intelligent Insights

  • possible ID detection
  • high-cardinality warnings
  • sparse column warnings
  • suspicious quality patterns
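Insights like these are typically driven by simple ratios. The sketch below shows one plausible heuristic; the thresholds, labels, and the function name `column_insights` are assumptions for illustration and are not taken from Aniwa's code.

```python
def column_insights(values, cardinality_ratio=0.9, sparse_ratio=0.5):
    """Return a list of heuristic labels for one column of raw values."""
    total = len(values)
    if total == 0:
        return []

    present = [v for v in values if v not in (None, "")]
    null_share = 1 - len(present) / total
    insights = []

    # Sparse column: a large share of cells is missing.
    if null_share >= sparse_ratio:
        insights.append("sparse column")

    if present:
        unique_share = len(set(present)) / len(present)
        # Possible ID: every value is present and distinct.
        if unique_share == 1.0 and null_share == 0:
            insights.append("possible ID")
        # High cardinality: almost every present value is distinct.
        elif unique_share >= cardinality_ratio:
            insights.append("high cardinality")

    return insights


print(column_insights(["1", "2", "3", "4"]))
print(column_insights(["a", "a", "", ""]))
```

A fully unique column with no gaps is flagged as a possible ID, while a half-empty column is flagged as sparse. Real tools usually combine such ratios with column names and value patterns.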

Reporting

Rich Terminal Reports

Aniwa renders its terminal output with Rich, producing readable, developer-friendly reports.

JSON Export

Export profiling results as machine-readable JSON for use in scripts and pipelines.
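Because the report is plain JSON, it can be loaded with Python's standard `json` module. The report's schema is not documented in this README, so the sketch below only lists the top-level keys rather than assuming any field names; `top_level_keys` is a hypothetical helper and `profile.json` is the output path used in the examples.

```python
import json
from pathlib import Path


def top_level_keys(report_path):
    """Load a JSON report and return its top-level keys, sorted."""
    with Path(report_path).open(encoding="utf-8") as fh:
        report = json.load(fh)
    # Only a JSON object has keys; any other top-level type yields an empty list.
    return sorted(report) if isinstance(report, dict) else []


if __name__ == "__main__" and Path("profile.json").exists():
    print(top_level_keys("profile.json"))
```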

HTML Reports

Generate shareable profiling reports for teams, audits, and debugging workflows.


Installation from Source

Clone the Repository

git clone https://github.com/ReginaldErzoah/Aniwa.git
cd Aniwa

Create a Virtual Environment

python -m venv .venv

Activate the environment:

Windows (Git Bash)

source .venv/Scripts/activate

Windows (PowerShell)

.venv\Scripts\Activate.ps1

macOS/Linux

source .venv/bin/activate

Install Dependencies

pip install -r requirements.txt

Install Aniwa locally:

pip install -e .

Usage

Basic Profiling

aniwa examples/customers.csv

Generate JSON Report

aniwa examples/customers.csv --report json --output profile.json

Generate HTML Report

aniwa examples/customers.csv --report html --output profile.html

Fast Profiling Mode

aniwa examples/customers.csv --mode fast

Deep Profiling Mode

aniwa examples/customers.csv --mode deep
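For automation, the commands above can be driven from Python with `subprocess`. This is a minimal sketch that uses only the flags shown in this README (`--report`, `--output`, `--mode`); the helper names are hypothetical, and combining `--mode` with `--report` in one call is an assumption the README does not confirm.

```python
import subprocess


def build_aniwa_command(dataset, report=None, output=None, mode=None):
    """Build the argument list for one aniwa CLI invocation."""
    cmd = ["aniwa", dataset]
    if report:
        cmd += ["--report", report]
    if output:
        cmd += ["--output", output]
    if mode:
        cmd += ["--mode", mode]
    return cmd


def run_aniwa(dataset, **options):
    """Run the CLI; check=True raises CalledProcessError on a non-zero exit."""
    return subprocess.run(build_aniwa_command(dataset, **options), check=True)


# Requires aniwa to be installed and on PATH:
# run_aniwa("examples/customers.csv", report="json", output="profile.json")
print(build_aniwa_command("examples/customers.csv", report="json", output="profile.json"))
```

Passing the arguments as a list avoids shell quoting problems with file names that contain spaces.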

Example Console Output

┌──────────────────────────────┐
│      Aniwa Dataset Profile   │
├──────────────────────────────┤
│ Rows: 5                      │
│ Columns: 5                   │
│ Duplicate Rows: 1            │
└──────────────────────────────┘

Project Structure

Aniwa/
│
├── aniwa/
│   ├── cli.py
│   ├── core/
│   ├── io/
│   ├── models/
│   ├── reports/
│   └── utils/
│
├── tests/
├── examples/
├── README.md
├── CONTRIBUTING.md
├── requirements.txt
└── pyproject.toml

Roadmap

v0.1.1 - MVP Foundation

Core Features

  • [x] CSV support
  • [x] Excel support
  • [x] JSON support
  • [x] Parquet support
  • [x] schema profiling
  • [x] null analysis
  • [x] duplicate detection
  • [x] statistical profiling
  • [x] console reports
  • [x] JSON export
  • [x] HTML reports

Developer Experience

  • [x] Rich terminal UI
  • [x] fast and deep modes
  • [x] profiling insights


v0.2.0 - Intelligence Release

  • correlation analysis
  • outlier detection
  • semantic detection
  • improved insights
  • Markdown reports

v0.3.0 - Universal Connectivity

  • PostgreSQL support
  • MySQL support
  • DuckDB support
  • BigQuery support
  • profiling history
  • snapshot management

v0.4.0 - Extensibility

  • plugin system
  • custom profiling modules
  • community extensions

v0.5.0 - AI Intelligence

  • dataset summarization
  • semantic understanding
  • AI-powered recommendations
  • anomaly explanations

Philosophy

Aniwa is built around a few core principles:

  • universal
  • developer-first
  • fast
  • modular
  • intelligent
  • beautiful
  • automation-friendly

Contributing

Contributions are welcome.

See CONTRIBUTING.md for:

  • development setup
  • contribution guidelines
  • pull request workflow
  • testing instructions

License

Aniwa is released under the MIT License.

See LICENSE for details.

