Simple DataFrame cleaning toolkit
dfcleanerpro
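A lightweight preprocessing library for pandas DataFrames. It bundles common cleaning steps (missing values, column names, duplicate rows, constant columns, string whitespace) behind a single call.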
Installation
```bash
pip install dfcleanerpro
```
Key Capabilities
- Automated data cleaning pipeline
- Intelligent handling of missing values
- Standardized column name formatting
- Duplicate record elimination
- Removal of low-information columns
- String normalization for text fields
- Simple and intuitive API design
Getting Started
Basic Usage
```python
import pandas as pd
from dfcleanerpro import DataCleaner

# Load the raw data, then apply every cleaning step in one call
df = pd.read_csv("data.csv")
cleaned_df = DataCleaner(df).run_all()
```
Example Transformation
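```python
import pandas as pd
from dfcleanerpro import DataCleaner

# Messy sample: a padded column name, missing values,
# padded strings, and a column that never changes
data = {
    "Name ": ["Alice", "Bob", "Bob", "None"],
    "Age": [25, None, 25, 30],
    "City": [" Chennai", "Delhi ", "Delhi ", None],
    "Constant": [1, 1, 1, 1],
}
df = pd.DataFrame(data)

cleaned = DataCleaner(df).run_all()
print(cleaned)
```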
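To see what there is to clean in this sample before running the library, the frame can be inspected with plain pandas. The snippet below is only an illustration and does not use dfcleanerpro; the variable names are chosen for this example.

```python
import pandas as pd

# Same sample data as in the example above
df = pd.DataFrame({
    "Name ": ["Alice", "Bob", "Bob", "None"],
    "Age": [25, None, 25, 30],
    "City": [" Chennai", "Delhi ", "Delhi ", None],
    "Constant": [1, 1, 1, 1],
})

# Column names with leading or trailing whitespace
padded_columns = [c for c in df.columns if c != c.strip()]

# Number of null cells per column
null_counts = df.isna().sum()

# Columns that hold a single distinct value
constant_columns = [c for c in df.columns if df[c].nunique(dropna=False) <= 1]

# Rows that are exact copies of an earlier row
exact_duplicates = int(df.duplicated().sum())

print(padded_columns)         # ['Name ']
print(null_counts.to_dict())  # Age and City each have one null
print(constant_columns)       # ['Constant']
print(exact_duplicates)       # 0
```

Two details are worth noticing. The last entry in `Name ` is the string `"None"`, not a null, so pandas does not count it as missing. And the two Bob rows are not exact duplicates in the raw data, because one of them has a missing `Age`; whether they end up as duplicates depends on how that value is filled.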
What It Handles
| Task | Description |
|---|---|
| Missing Values | Replaces nulls using smart strategies |
| Column Formatting | Converts names to clean snake_case |
| Duplicate Rows | Identifies and removes duplicates |
| Constant Columns | Drops columns with no variance |
| String Cleanup | Removes unwanted whitespace |
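As a rough illustration of what these five tasks can look like in plain pandas, here is a minimal sketch. It is not dfcleanerpro's implementation: `to_snake_case` and `clean_frame` are hypothetical helpers written for this example, and the fill strategies (median for numeric columns, the placeholder `"unknown"` for text columns) are assumptions, since the library's own strategies are not documented here.

```python
import re

import pandas as pd


def to_snake_case(name: str) -> str:
    """Hypothetical helper: trim a column name and convert it to snake_case."""
    name = re.sub(r"([a-z0-9])([A-Z])", r"\1_\2", str(name).strip())
    name = re.sub(r"[^0-9a-zA-Z]+", "_", name)
    return name.strip("_").lower()


def clean_frame(df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical stand-in for the tasks in the table (not the library's code)."""
    out = df.copy()

    # Column formatting
    out.columns = [to_snake_case(c) for c in out.columns]

    for col in out.columns:
        s = out[col]
        if pd.api.types.is_numeric_dtype(s):
            # Missing values (assumption): fill numeric nulls with the median
            out[col] = s.fillna(s.median())
        elif pd.api.types.is_object_dtype(s) or pd.api.types.is_string_dtype(s):
            # String cleanup: strip whitespace around string values
            s = s.map(lambda v: v.strip() if isinstance(v, str) else v)
            # Missing values (assumption): fill text nulls with a placeholder
            out[col] = s.fillna("unknown")

    # Constant columns: drop columns with one distinct value
    # (skipped for frames with fewer than two rows)
    if len(out) > 1:
        keep = [c for c in out.columns if out[c].nunique(dropna=False) > 1]
        out = out[keep]

    # Duplicate rows: keep the first occurrence
    return out.drop_duplicates().reset_index(drop=True)


sample = pd.DataFrame({
    "Name ": ["Alice", "Bob", "Bob", "None"],
    "Age": [25, None, 25, 30],
    "City": [" Chennai", "Delhi ", "Delhi ", None],
    "Constant": [1, 1, 1, 1],
})
result = clean_frame(sample)
print(result)
```

The order of the steps matters in this sketch: nulls are filled and strings are stripped before duplicates are dropped, so rows that differ only by a missing value or stray whitespace collapse into one. The library may order or implement these steps differently.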
Design Philosophy
Data preprocessing is a repetitive but critical step in any data workflow. This library focuses on:
- Simplicity over complexity
- Clean and readable transformations
- Reusability across projects
Built With
- Python
- Pandas
- NumPy
Use Cases
- Data Engineering pipelines
- Data Science preprocessing
- Exploratory Data Analysis (EDA)
- Machine Learning data preparation
Future Enhancements
- CLI support for CSV processing
- Data validation rules
- Outlier detection utilities
- Data profiling reports
- Integration with big data tools
Contributions
Contributions, issues, and feature requests are welcome!
License
MIT License
File details
Details for the file dfcleanerpro-0.2.3.tar.gz.
File metadata
- Download URL: dfcleanerpro-0.2.3.tar.gz
- Upload date:
- Size: 3.5 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.13.5
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 54523d96f8121323c1098ba48da157b44fced2eb8631b7d94ce1a970d15d4f86 |
| MD5 | 746c107571f82890c59bdbcc25ab1db0 |
| BLAKE2b-256 | a874495b8cbef9bc1aaafa810b3f7b4053399e949b772894c68bc874d15b1c97 |
File details
Details for the file dfcleanerpro-0.2.3-py3-none-any.whl.
File metadata
- Download URL: dfcleanerpro-0.2.3-py3-none-any.whl
- Upload date:
- Size: 3.6 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.13.5
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | f24eece2c07c8b25924899970945114e284971ffdc12c4177fb4640c06da7090 |
| MD5 | 50c294728306c7b403ab7b9a930b7392 |
| BLAKE2b-256 | 9247f74eec5b895c5dd9b6718c3be45566205e1a23948f214b12b33f11f4e2d6 |