Universal dataset profiling and intelligence tool
Aniwa
See your data clearly.
Aniwa is an open-source universal dataset profiling and intelligence tool designed for developers, analysts, data engineers, researchers, and modern data teams.
Aniwa helps you instantly understand datasets through:
- schema profiling
- data quality analysis
- statistical summaries
- intelligent insights
- rich terminal reports
- shareable HTML reports
Whether you're working with CSV files, Excel spreadsheets, JSON datasets, or Parquet files, Aniwa gives you a fast and elegant way to inspect and understand data.
Why Aniwa?
Data professionals constantly work with unknown datasets.
Before trusting a dataset, people need to know:
- What columns exist?
- What data types are present?
- Are there missing values?
- Are there duplicates?
- Are there suspicious patterns?
- Which columns might contain IDs or PII?
- Is the dataset healthy?
Aniwa makes answering those questions simple.
Quick Installation
Install Aniwa from PyPI:
pip install aniwa
Verify installation:
aniwa --help
Quick Start
Profile a dataset:
aniwa customers.csv
Generate a JSON report:
aniwa customers.csv --report json --output profile.json
Generate an HTML report:
aniwa customers.csv --report html --output profile.html
Run lightweight profiling:
aniwa customers.csv --mode fast
Run full profiling:
aniwa customers.csv --mode deep
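For automation, the same commands can be driven from a script. The following is a minimal sketch that only assembles and runs the flags shown above (`--report`, `--output`, `--mode`); the helper names `build_aniwa_command` and `run_aniwa` are hypothetical and not part of Aniwa, and whether `--mode` can be combined with `--report` is an assumption.

```python
import shutil
import subprocess


def build_aniwa_command(dataset, report=None, output=None, mode=None):
    """Assemble an `aniwa` command line from the flags shown in Quick Start.

    Hypothetical helper: it only concatenates documented flags.
    """
    cmd = ["aniwa", dataset]
    if report is not None:
        cmd += ["--report", report]
    if output is not None:
        cmd += ["--output", output]
    if mode is not None:
        cmd += ["--mode", mode]
    return cmd


def run_aniwa(cmd):
    """Run the command if `aniwa` is on PATH; return its exit code, else None."""
    if shutil.which(cmd[0]) is None:
        return None
    return subprocess.run(cmd, check=False).returncode


command = build_aniwa_command("customers.csv", report="json", output="profile.json")
print(" ".join(command))
```

Calling `run_aniwa(command)` executes the assembled command only when the `aniwa` executable is installed.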
Supported Formats
Aniwa currently supports:
- CSV
- Excel (.xlsx)
- JSON
- Parquet
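One common way to route a file to the right reader is to look at its extension. The sketch below illustrates that idea for the four formats listed above; the mapping `FORMAT_BY_SUFFIX` and the function `detect_format` are hypothetical names, and this is not necessarily how Aniwa selects a reader internally.

```python
from pathlib import Path

# Hypothetical mapping from file extension to a format label.
FORMAT_BY_SUFFIX = {
    ".csv": "csv",
    ".xlsx": "excel",
    ".json": "json",
    ".parquet": "parquet",
}


def detect_format(path):
    """Return a format label based on the file extension (case-insensitive)."""
    suffix = Path(path).suffix.lower()
    if suffix not in FORMAT_BY_SUFFIX:
        raise ValueError(f"Unsupported dataset format: {suffix or '(no extension)'}")
    return FORMAT_BY_SUFFIX[suffix]
```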
Features
Universal Dataset Support
Aniwa supports multiple modern dataset formats:
- CSV
- Excel
- JSON
- Parquet
Support for the following sources is planned for future releases:
- PostgreSQL
- MySQL
- DuckDB
- BigQuery
- Snowflake
Core Profiling
Aniwa provides:
Dataset Summary
- row counts
- column counts
- dataset size analysis
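To make the summary items concrete, here is a minimal standard-library sketch that computes row count, column count, and size for CSV text. The function name `summarize` and the choice to measure size as UTF-8 bytes are assumptions for illustration, not Aniwa's implementation.

```python
import csv
import io


def summarize(csv_text):
    """Return row count, column count, and UTF-8 size for CSV text."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader, [])
    # Skip blank lines, which csv.reader yields as empty lists.
    row_count = sum(1 for row in reader if row)
    return {
        "rows": row_count,
        "columns": len(header),
        "size_bytes": len(csv_text.encode("utf-8")),
    }


SAMPLE = "id,name\n1,Ada\n2,Grace\n"
print(summarize(SAMPLE))
```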
Schema Profiling
- type inference
- mixed type detection
- schema overview
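Type inference and mixed type detection can be sketched as follows, assuming every cell arrives as a string (as it does when reading CSV). The type labels and the functions `infer_value_type` and `infer_column_type` are hypothetical; Aniwa's actual inference rules may differ.

```python
def infer_value_type(value):
    """Classify a single string cell as null, integer, float, or string."""
    text = value.strip()
    if text == "":
        return "null"
    try:
        int(text)
        return "integer"
    except ValueError:
        pass
    try:
        float(text)
        return "float"
    except ValueError:
        return "string"


def infer_column_type(values):
    """Infer one type for a column, reporting 'mixed' when cell kinds conflict."""
    kinds = {infer_value_type(v) for v in values} - {"null"}
    if not kinds:
        return "empty"
    if kinds == {"integer"}:
        return "integer"
    if kinds <= {"integer", "float"}:
        # Integers widen to float when both appear in the same column.
        return "float"
    if kinds == {"string"}:
        return "string"
    return "mixed"
```

In this sketch nulls are ignored when deciding the column type, so a numeric column with gaps is still reported as numeric.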
Data Quality Analysis
- null analysis
- duplicate detection
- uniqueness analysis
- sparse column detection
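The four checks above can be illustrated with a short standard-library sketch. It assumes that empty strings count as nulls, that a duplicate is any row identical to an earlier one, and that a column is sparse when more than half of its values are null; the function `quality_report` and the `sparse_threshold` parameter are hypothetical and not taken from Aniwa.

```python
import csv
import io
from collections import Counter

SAMPLE = (
    "id,name,email,phone\n"
    "1,Ada,ada@example.com,\n"
    "2,Grace,,\n"
    "3,Linus,linus@example.com,\n"
    "4,Guido,guido@example.com,555-0100\n"
    "4,Guido,guido@example.com,555-0100\n"
)


def quality_report(csv_text, sparse_threshold=0.5):
    """Compute null, duplicate, uniqueness, and sparsity figures for CSV text."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader, [])
    rows = [tuple(row) for row in reader if row]
    total = len(rows)
    # Every occurrence after the first counts as one duplicate row.
    duplicate_rows = sum(n - 1 for n in Counter(rows).values())
    columns = {}
    for index, name in enumerate(header):
        # Treat cells missing from short rows as empty.
        values = [row[index] if index < len(row) else "" for row in rows]
        non_null = [v for v in values if v.strip() != ""]
        nulls = total - len(non_null)
        null_ratio = nulls / total if total else 0.0
        columns[name] = {
            "nulls": nulls,
            "null_ratio": null_ratio,
            "unique": len(set(non_null)),
            "sparse": null_ratio > sparse_threshold,
        }
    return {"rows": total, "duplicate_rows": duplicate_rows, "columns": columns}


report = quality_report(SAMPLE)
print(report["duplicate_rows"], report["columns"]["phone"])
```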
Statistical Profiling
- minimum values
- maximum values
- mean
- median
- standard deviation
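These statistics map directly onto Python's `statistics` module. The sketch below is an illustration only: the function `numeric_summary` is hypothetical, blank cells are skipped, and it uses the sample standard deviation, whereas the README does not say whether Aniwa reports the sample or population form.

```python
import statistics


def numeric_summary(values):
    """Summarize a numeric column given as strings; blank cells are skipped."""
    numbers = [float(v) for v in values if v is not None and str(v).strip() != ""]
    if not numbers:
        return None
    return {
        "min": min(numbers),
        "max": max(numbers),
        "mean": statistics.mean(numbers),
        "median": statistics.median(numbers),
        # Sample standard deviation needs at least two data points.
        "stdev": statistics.stdev(numbers) if len(numbers) > 1 else 0.0,
    }


print(numeric_summary(["10", "20", "", "30", "40"]))
```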
Intelligent Insights
- possible ID detection
- high-cardinality warnings
- sparse column warnings
- suspicious quality patterns
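Insights like these are typically simple heuristics over per-column ratios. The following sketch shows one possible set of rules: a column with no nulls and all-unique values is flagged as a possible ID, a column whose unique ratio reaches a threshold is flagged as high cardinality, and a column that is mostly null is flagged as sparse. The function `column_insights`, both thresholds, and the rule definitions are assumptions, not Aniwa's actual logic.

```python
def column_insights(values, cardinality_threshold=0.9, sparse_threshold=0.5):
    """Return heuristic insight labels for one column of string cells."""
    total = len(values)
    if total == 0:
        return []
    non_null = [v for v in values if v not in (None, "")]
    null_ratio = 1 - len(non_null) / total
    unique_ratio = len(set(non_null)) / len(non_null) if non_null else 0.0

    insights = []
    if len(non_null) == total and unique_ratio == 1.0:
        # Fully populated and fully unique: behaves like an identifier.
        insights.append("possible ID")
    elif unique_ratio >= cardinality_threshold:
        insights.append("high cardinality")
    if null_ratio > sparse_threshold:
        insights.append("sparse")
    return insights


print(column_insights(["1", "2", "3", "4"]))
```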
Reporting
Rich Terminal Reports
Aniwa renders its terminal reports with Rich, producing readable, developer-friendly output.
JSON Export
Machine-readable profiling results.
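A JSON report can be consumed with any JSON parser. The sketch below loads a report file and lists its top-level keys; it deliberately assumes nothing about the report's schema, which is not documented here, and the function names `load_report` and `top_level_keys` are hypothetical.

```python
import json


def load_report(path):
    """Load a JSON report written with `--report json --output <path>`."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


def top_level_keys(report):
    """List the report's top-level keys without assuming a particular schema."""
    return sorted(report) if isinstance(report, dict) else []
```

For example, `top_level_keys(load_report("profile.json"))` is a quick way to see which sections a generated report contains before writing code against it.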
HTML Reports
Generate shareable profiling reports for teams, audits, and debugging workflows.
Installation from Source
Clone the Repository
git clone https://github.com/ReginaldErzoah/Aniwa.git
cd Aniwa
Create a Virtual Environment
python -m venv .venv
Activate the environment:
Windows (Git Bash)
source .venv/Scripts/activate
Windows (PowerShell or Command Prompt)
.venv\Scripts\activate
macOS/Linux
source .venv/bin/activate
Install Dependencies
pip install -r requirements.txt
Install Aniwa locally:
pip install -e .
Usage
Basic Profiling
aniwa examples/customers.csv
Generate JSON Report
aniwa examples/customers.csv --report json --output profile.json
Generate HTML Report
aniwa examples/customers.csv --report html --output profile.html
Fast Profiling Mode
aniwa examples/customers.csv --mode fast
Deep Profiling Mode
aniwa examples/customers.csv --mode deep
Example Console Output
┌──────────────────────────────┐
│ Aniwa Dataset Profile        │
├──────────────────────────────┤
│ Rows: 5                      │
│ Columns: 5                   │
│ Duplicate Rows: 1            │
└──────────────────────────────┘
Project Structure
Aniwa/
│
├── aniwa/
│   ├── cli.py
│   ├── core/
│   ├── io/
│   ├── models/
│   ├── reports/
│   └── utils/
│
├── tests/
├── examples/
├── README.md
├── CONTRIBUTING.md
├── requirements.txt
└── pyproject.toml
Roadmap
v0.1.1 - MVP Foundation
Core Features
- [x] CSV support
- [x] Excel support
- [x] JSON support
- [x] Parquet support
- [x] schema profiling
- [x] null analysis
- [x] duplicate detection
- [x] statistical profiling
- [x] console reports
- [x] JSON export
- [x] HTML reports
Developer Experience
- [x] Rich terminal UI
- [x] fast and deep modes
- [x] profiling insights
v0.2.0 - Intelligence Release
- correlation analysis
- outlier detection
- semantic detection
- improved insights
- Markdown reports
v0.3.0 - Universal Connectivity
- PostgreSQL support
- MySQL support
- DuckDB support
- BigQuery support
- profiling history
- snapshot management
v0.4.0 - Extensibility
- plugin system
- custom profiling modules
- community extensions
v0.5.0 - AI Intelligence
- dataset summarization
- semantic understanding
- AI-powered recommendations
- anomaly explanations
Philosophy
Aniwa is built around a few core principles:
- universal
- developer-first
- fast
- modular
- intelligent
- beautiful
- automation-friendly
Contributing
Contributions are welcome.
See CONTRIBUTING.md for:
- development setup
- contribution guidelines
- pull request workflow
- testing instructions
License
Aniwa is released under the MIT License.
See LICENSE for details.
File details
Details for the file aniwa-0.1.1.tar.gz.
File metadata
- Download URL: aniwa-0.1.1.tar.gz
- Upload date:
- Size: 11.8 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.14.4
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | d11b6eec76aa143bcbb89c6e0018117ba989ed06bc97fec8f8f244d4b4c16021 |
| MD5 | 2260e097deeac52ead92bcefb03028eb |
| BLAKE2b-256 | e9d9b386e8e3d1fc40747f3120de4a719975b32559ee36fcc0e2f7c86df358d3 |
File details
Details for the file aniwa-0.1.1-py3-none-any.whl.
File metadata
- Download URL: aniwa-0.1.1-py3-none-any.whl
- Upload date:
- Size: 10.6 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.14.4
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | fe2e32c97c1bc53950daf84ea9f181826261faa12adc7bae993c5da16c20dd64 |
| MD5 | a97b336b2f138e8d12fb059ca4420f42 |
| BLAKE2b-256 | 1c02084fea1bda32db734b4c38e6298861d167975a767877f74dab26164c2357 |