megaprofiler is a highly customizable and extensible data profiling library designed to help data scientists and engineers understand their datasets before performing analysis or building models.
Project description
When working with large datasets, it’s often necessary to understand data types, distributions, and potential issues (e.g., missing values, outliers) before analysis. While libraries like pandas-profiling exist, there is still room for an extensible, easy-to-use, and highly customizable profiler that integrates data validation.
Key Features:
- Automatic Data Summaries: Provide per-column insights such as distributions, unique values, and missing values.
- Anomaly Detection: Automatically flag columns or rows with unusual distributions, outliers, or inconsistent data.
- Data Validation: Set validation rules (e.g., no missing values in specific columns, data type constraints) and get alerts if the data violates these rules.
- Custom Reports: Generate visual reports (e.g., HTML, PDF) with configurable thresholds for what counts as an anomaly.
- Data Drift Detection: Track changes in data distributions over time to identify shifts in data quality or content.
Benefits: megaprofiler is aimed at data scientists and engineers who work on exploratory data analysis, data quality checks, and ETL pipelines, where it reduces manual data investigation.
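To make the first three features concrete, here is a minimal standard-library sketch of a per-column summary, an outlier check, and a "no missing values" rule. It does not use megaprofiler and does not reflect its API: the helpers `is_missing`, `profile_column`, and `validate_no_missing`, the sample table, and the 1.5 × IQR outlier rule are all assumptions chosen for illustration.

```python
import math
import statistics


def is_missing(value):
    """Treat None and float NaN as missing (an assumption for this sketch)."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def profile_column(values):
    """Summarize one column: row count, missing count, unique count and,
    for all-numeric columns with at least 4 values, mean and IQR outliers."""
    present = [v for v in values if not is_missing(v)]
    summary = {
        "count": len(values),
        "missing": len(values) - len(present),
        "unique": len(set(present)),
    }
    numeric = [
        v for v in present
        if isinstance(v, (int, float)) and not isinstance(v, bool)
    ]
    if len(numeric) == len(present) and len(numeric) >= 4:
        # Quartiles via the statistics module's default (exclusive) method.
        q1, _, q3 = statistics.quantiles(numeric, n=4)
        iqr = q3 - q1
        low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
        summary["mean"] = statistics.fmean(numeric)
        summary["outliers"] = [v for v in numeric if v < low or v > high]
    return summary


def validate_no_missing(table, columns):
    """Return the names of the given columns that contain missing values."""
    return [name for name in columns if any(is_missing(v) for v in table[name])]


# Hypothetical sample data: a dict mapping column names to lists of values.
table = {
    "age": [23, 25, 24, 26, 25, 24, 120, None],
    "city": ["Oslo", "Oslo", "Lima", None, "Lima", "Pune", "Pune", "Oslo"],
}

report = {name: profile_column(values) for name, values in table.items()}
violations = validate_no_missing(table, ["age", "city"])

for name, summary in report.items():
    print(name, summary)
print("columns violating the no-missing rule:", violations)
```

In this sketch the value 120 in the hypothetical `age` column falls outside the IQR fence and is reported as an outlier, and both columns fail the no-missing rule. A real profiler would typically make the threshold configurable, which is what the Custom Reports feature describes.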
To use:
'from megaprofiler import MegaProfiler'
Download files
File details
Details for the file megaprofiler-0.2.2.tar.gz.
File metadata
- Download URL: megaprofiler-0.2.2.tar.gz
- Upload date:
- Size: 8.5 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/5.1.1 CPython/3.11.6
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | d01f686d2b3efc5732cead40c2d002bf5c36d766f3547aabe6e3d749908ab1bf |
| MD5 | 973c9178c382d410c6bc145cb1020c8c |
| BLAKE2b-256 | 5b663045c9e497aba5d8afaad37e9cab94a142132dd0850107135e0cad89618a |
File details
Details for the file megaprofiler-0.2.2-py3-none-any.whl.
File metadata
- Download URL: megaprofiler-0.2.2-py3-none-any.whl
- Upload date:
- Size: 9.2 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/5.1.1 CPython/3.11.6
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 4fa2b5820a0e05a1672da72d3fe6d0aa358ca32b656bca504220818fe46e7987 |
| MD5 | a19611cb37e7c045ed29b080062a8edb |
| BLAKE2b-256 | dfef49c32fc40aeb4b017e8b3cae82cdc851a652580c747a3e2aed0328e425fc |
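The digests listed for each file can be used to check a download before installing it. The following is a minimal sketch using Python's standard `hashlib` module; the helper names `sha256_of` and `matches_expected` and the local file path are assumptions, and the expected value is the SHA256 digest listed for megaprofiler-0.2.2.tar.gz.

```python
import hashlib
from pathlib import Path

# SHA256 digest listed for megaprofiler-0.2.2.tar.gz.
EXPECTED_SHA256 = "d01f686d2b3efc5732cead40c2d002bf5c36d766f3547aabe6e3d749908ab1bf"


def sha256_of(path, chunk_size=65536):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected_hex):
    """Compare a file's SHA256 digest with an expected hex string."""
    return sha256_of(path) == expected_hex.strip().lower()


# Assumed local path; adjust to wherever the archive was downloaded.
archive = Path("megaprofiler-0.2.2.tar.gz")
if archive.exists():
    print("digest matches:", matches_expected(archive, EXPECTED_SHA256))
```

The same check works for the wheel by substituting its file name and listed SHA256 digest.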