A beginner-friendly Python package to clean, classify, and visualize CSV data via a simple GUI.
cleanclassify
cleanclassify is a beginner-friendly Python package that helps you clean, classify, and visualize CSV data, all from a graphical interface. Whether you're a student, a data science enthusiast, or someone exploring machine learning for the first time, it takes you from a raw CSV file to a model comparison without writing code.
What It Does
- Cleans your dataset automatically
  - Handles missing values
  - Drops problematic or high-cardinality columns
  - Scales numeric features
  - Encodes categorical variables
- Runs machine learning models
  - Trains and evaluates Logistic Regression, Random Forest, and Support Vector Classifier using scikit-learn
- Visualizes model performance
  - Shows accuracy, precision, recall, and F1-score
  - Highlights the best-performing model
  - Plots a clean bar chart using matplotlib
- Requires zero coding
  - Just load your CSV, pick the target column, and click the Run Cleaning and Run Classification buttons. That's it!
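To make the reported metrics concrete, here is a minimal, dependency-free sketch of how accuracy, precision, recall, and F1-score can be computed for binary labels, and how a "best" model could be chosen. The package itself relies on scikit-learn; the function names `binary_metrics` and `best_model`, the sample predictions, and the choice of accuracy as the deciding metric are all assumptions made for illustration.

```python
from collections import Counter


def binary_metrics(y_true, y_pred, positive=1):
    """Return accuracy, precision, recall, and F1 for binary labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    counts = Counter()
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            counts["tp" if truth == positive else "fp"] += 1
        else:
            counts["fn" if truth == positive else "tn"] += 1
    tp, fp, fn, tn = counts["tp"], counts["fp"], counts["fn"], counts["tn"]
    total = tp + fp + fn + tn
    # Guard every division so empty or one-sided inputs yield 0.0.
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


def best_model(scores, metric="accuracy"):
    """Return the model name with the highest value for `metric` (assumed rule)."""
    return max(scores, key=lambda name: scores[name][metric])


# Hypothetical predictions from three models on the same six test rows.
y_true = [1, 0, 1, 1, 0, 0]
predictions = {
    "Logistic Regression": [1, 0, 0, 1, 0, 1],
    "Random Forest": [1, 0, 1, 1, 0, 1],
    "Support Vector Classifier": [0, 0, 1, 0, 0, 0],
}
scores = {name: binary_metrics(y_true, y_pred) for name, y_pred in predictions.items()}
winner = best_model(scores)
print(winner, scores[winner])
```

Passing `metric="f1"` to `best_model` ranks the models by F1-score instead, which can matter when the classes are imbalanced.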
Installation
Install it directly from PyPI:
pip install cleanclassify
This will automatically install the required dependencies: pandas, numpy, scikit-learn, and matplotlib.
How to Use
Launch the GUI with:
python -m cleanclassify
Or, if the console entry point was set up during installation, run the CLI script directly:
cleanclassify
💻 Example Workflow
- Launch the app.
- Browse and load your CSV file.
- Select the target column you want to predict.
- Click Run Cleaning to clean and prepare your dataset.
- Click Run Classification to train and evaluate models.
- View detailed metrics and a comparison chart of model performance.
What Your Data Should Look Like
- Must contain a target column (the label you're predicting).
- Can include both numeric and categorical features.
- Should not include long text or extremely high-cardinality columns (they’ll be automatically dropped for performance).
- If the dataset has more than 2000 rows, it will be automatically downsampled for memory efficiency.
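If you want to check a file against these rules before loading it, the sketch below flags text columns that look high-cardinality and downsamples anything over 2000 rows. It uses only the standard library. The 2000-row limit comes from this README; the 0.5 unique-value ratio, the use of random sampling, and all function names are hypothetical and may differ from what the package actually does.

```python
import csv
import io
import random

MAX_ROWS = 2000         # row limit stated in this README
MAX_UNIQUE_RATIO = 0.5  # hypothetical cutoff, not taken from the package


def is_number(text):
    """Return True if the string parses as a float."""
    try:
        float(text)
    except ValueError:
        return False
    return True


def load_rows(csv_text):
    """Parse CSV text into a list of dicts keyed by column name."""
    return list(csv.DictReader(io.StringIO(csv_text)))


def high_cardinality_columns(rows, target, max_unique_ratio=MAX_UNIQUE_RATIO):
    """Return non-numeric feature columns in which most values are distinct."""
    if not rows:
        return []
    flagged = []
    for column in rows[0]:
        if column == target:
            continue
        values = [row[column] for row in rows]
        if all(is_number(v) for v in values if v != ""):
            continue  # numeric columns are kept for scaling
        if len(set(values)) / len(values) > max_unique_ratio:
            flagged.append(column)
    return flagged


def downsample(rows, max_rows=MAX_ROWS, seed=0):
    """Return at most `max_rows` rows, sampled reproducibly without replacement."""
    if len(rows) <= max_rows:
        return rows
    return random.Random(seed).sample(rows, max_rows)


# Build a small synthetic dataset with 2500 rows for demonstration.
colors = ["red", "green", "blue"]
lines = ["user_id,age,color,label"]
for i in range(2500):
    lines.append(f"user_{i},{20 + i % 40},{colors[i % 3]},{i % 2}")

rows = load_rows("\n".join(lines))
dropped = high_cardinality_columns(rows, target="label")
sample = downsample(rows)
print(dropped, len(sample))
```

In this synthetic data, `user_id` is unique per row and would be flagged, while `color` has only three distinct values and would be kept for encoding.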
Under the Hood
- cleaner.py: Preprocesses data (cleans, encodes, scales, and downsamples).
- classify.py: Trains and evaluates three ML models.
- gui.py: A simple but powerful GUI built with tkinter.
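To give a feel for the kind of preprocessing described above, here is a minimal standard-library sketch of three of the steps: filling missing values, scaling a numeric column, and encoding a categorical column. The specific strategies (mean imputation, standardization to zero mean and unit variance, one-hot encoding) and the function names are assumptions for illustration; this README does not state which strategies the package uses.

```python
from statistics import mean, pstdev


def impute_and_scale(values):
    """Fill None entries with the column mean, then standardize the column."""
    present = [v for v in values if v is not None]
    fill = mean(present)
    filled = [fill if v is None else v for v in values]
    mu, sigma = mean(filled), pstdev(filled)
    if sigma == 0:
        # A constant column carries no information; map it to zeros.
        return [0.0 for _ in filled]
    return [(v - mu) / sigma for v in filled]


def one_hot(values):
    """Encode a categorical column as one 0/1 column per category."""
    categories = sorted(set(values))
    return {f"is_{c}": [1 if v == c else 0 for v in values] for c in categories}


# Hypothetical columns from a three-row dataset.
ages = [20.0, None, 40.0]
colors = ["red", "blue", "red"]

scaled_ages = impute_and_scale(ages)
encoded = one_hot(colors)
print(scaled_ages)
print(encoded)
```

The missing age is replaced by the mean of the observed values (30.0), so after scaling it sits exactly at zero.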
👤 Author
Crafted with ❤️ by Safa Mahveen
File details
Details for the file cleanclassify-0.1.tar.gz.
File metadata
- Download URL: cleanclassify-0.1.tar.gz
- Upload date:
- Size: 6.9 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.12.2
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 048de24ebaf18ddd5aeb47fba2e96b91f431298a4010cb388d026e1fc863ff6f |
| MD5 | db02bcf4041ab0dd13e4e88270e59454 |
| BLAKE2b-256 | cee9aa475588698f22c73ebce3e4b52fc6f6ba269c0cedb47a98f59458ddb4d6 |
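A published digest is only useful if you compare it against the file you actually downloaded. The sketch below shows one way to do that with Python's `hashlib`; the helper names `sha256_of_file` and `matches_published_digest` are hypothetical, and the demonstration runs on a throwaway temporary file rather than the real archive.

```python
import hashlib
import os
import tempfile


def sha256_of_file(path, chunk_size=65536):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_digest(path, expected_hex):
    """Compare a file's SHA256 digest with a published hex string."""
    return sha256_of_file(path) == expected_hex.strip().lower()


# Demonstration on a temporary file with known contents.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"hello")
    demo_path = tmp.name

demo_ok = matches_published_digest(demo_path, hashlib.sha256(b"hello").hexdigest())
os.remove(demo_path)
print(demo_ok)
```

For the real download, you would pass the path to your local copy of `cleanclassify-0.1.tar.gz` together with the SHA256 value from the table above.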
File details
Details for the file cleanclassify-0.1-py3-none-any.whl.
File metadata
- Download URL: cleanclassify-0.1-py3-none-any.whl
- Upload date:
- Size: 8.2 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.12.2
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 6ea23bc3565ae0c0f4e49c492206f147817349597dd28587d2c3f38f6aec418b |
| MD5 | ed4bb842b20d62906e5cb2bf088b61cc |
| BLAKE2b-256 | 2ab489e5140172967f6c3d9dffb648380178b749ad2483bfe614ae0a46ae7892 |