A simple data cleansing tool using pandas and machine learning models
datarun
datarun is a lightweight Python package that helps you cleanse your pandas DataFrames with minimal configuration.
It supports automatic handling of duplicates, missing values, constant columns, and type conversion.
Features
- Drop duplicate rows
- Handle missing values by imputing the mean, median, or mode, or by dropping them
- Drop constant-value columns
- Convert numeric columns stored as strings to numeric types
- Configurable and simple to use
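The README does not show datarun's cleansing API, so the following is only a minimal sketch of the steps listed above, written with plain pandas. The helper name `cleanse` and its `missing` parameter are hypothetical and are not part of datarun.

```python
import pandas as pd


def cleanse(df: pd.DataFrame, missing: str = "mean") -> pd.DataFrame:
    """Hypothetical helper (not datarun's API) mirroring the feature list."""
    # 1. Drop duplicate rows.
    out = df.drop_duplicates().copy()

    # 2. Convert string columns to numeric when every non-null value parses.
    for col in out.columns:
        if pd.api.types.is_numeric_dtype(out[col]):
            continue
        converted = pd.to_numeric(out[col], errors="coerce")
        if converted.notna().sum() == out[col].notna().sum():
            out[col] = converted

    # 3. Handle missing values: drop rows, or fill with mean/median/mode.
    if missing == "drop":
        out = out.dropna()
    else:
        for col in out.columns:
            if not out[col].isna().any():
                continue
            numeric = pd.api.types.is_numeric_dtype(out[col])
            if missing == "mean" and numeric:
                fill = out[col].mean()
            elif missing == "median" and numeric:
                fill = out[col].median()
            else:
                # Mode is also the fallback for non-numeric columns.
                modes = out[col].mode()
                if modes.empty:
                    continue
                fill = modes.iloc[0]
            out[col] = out[col].fillna(fill)

    # 4. Drop columns that hold a single constant value.
    constant = [c for c in out.columns if out[c].nunique(dropna=False) <= 1]
    return out.drop(columns=constant).reset_index(drop=True)


if __name__ == "__main__":
    raw = pd.DataFrame(
        {
            "age": ["25", "30", None, "30"],
            "city": ["Oslo", "Rome", "Rome", "Rome"],
            "flag": [1, 1, 1, 1],
        }
    )
    print(cleanse(raw, missing="mean"))
```

In this sketch the type conversion runs before the missing-value step so that mean and median imputation can apply to columns that arrived as strings. The package itself may order the steps differently.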
Example: Linear Regression
import pandas as pd
from datarun import LinearRegressionCustom
data = pd.read_csv("Salary_dataset.csv")  # expects YearsExperience and Salary columns
X = data[['YearsExperience']]  # feature matrix (2-D)
y = data['Salary']  # target vector (1-D)
model = LinearRegressionCustom(method='gradient_descent', learning_rate=0.01, epochs=5000)
model.fit(X, y)
preds = model.predict(X)  # predictions on the training data
print(model.get_params())  # show the fitted parameters
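The README does not describe how `method='gradient_descent'` works internally. As a rough illustration of what such a method typically does, the sketch below fits a line by batch gradient descent on mean squared error using NumPy. The function `fit_linear_gd` is hypothetical and is not datarun's implementation. The default `learning_rate` and `epochs` simply mirror the values in the example above.

```python
import numpy as np


def fit_linear_gd(X, y, learning_rate=0.01, epochs=5000):
    """Fit y ~ X @ w + b by batch gradient descent on mean squared error.

    Hypothetical illustration; not taken from the datarun source.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n_samples, n_features = X.shape
    w = np.zeros(n_features)
    b = 0.0
    for _ in range(epochs):
        error = X @ w + b - y  # residuals for the current parameters
        # Gradients of mean((X @ w + b - y) ** 2) with respect to w and b.
        grad_w = (2.0 / n_samples) * (X.T @ error)
        grad_b = (2.0 / n_samples) * error.sum()
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


if __name__ == "__main__":
    # Synthetic data on an exact line, so the fit should recover it closely.
    X_demo = np.arange(1.0, 11.0).reshape(-1, 1)
    y_demo = 3.0 * X_demo[:, 0] + 5.0
    print(fit_linear_gd(X_demo, y_demo))
```

Gradient descent is sensitive to feature scale: with much larger feature values than in this sketch, a learning rate of 0.01 can diverge, and scaling the features first is the usual remedy.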
Installation
pip install datarun
File details
Details for the file datarun-0.2.6.tar.gz.
File metadata
- Download URL: datarun-0.2.6.tar.gz
- Upload date:
- Size: 5.0 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.11.0
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | fac2f18407c869dac8674543eb6c96411eb65dd16db3ea38271a078efb3a3f0d |
| MD5 | 48bcb0e60a176dc3225e7ce14d8bb368 |
| BLAKE2b-256 | f42d6f90ef9124a404b1cbc243a4d29ac5747cfb0d0a6893665ee9bb8dc631bc |
File details
Details for the file datarun-0.2.6-py3-none-any.whl.
File metadata
- Download URL: datarun-0.2.6-py3-none-any.whl
- Upload date:
- Size: 6.0 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.11.0
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | df2b4273fbb5f0ad64c00eb52c4f0ab914d10b80ae6273a7801c6795eb3494e2 |
| MD5 | e803168056588529130595ff2082ce0a |
| BLAKE2b-256 | 10bd96056d0db896019ab9007d346d07778e43ecbc107429f94587aa59a188e2 |
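A downloaded file can be checked against the SHA256 digests listed above. The sketch below uses only Python's standard `hashlib` module. The helper names `sha256_of` and `matches_digest` are hypothetical, and the file is assumed to be in the current directory.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=65536):
    """Return the hex SHA256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with Path(path).open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_digest(path, expected_hex):
    """Compare a file's SHA256 digest with a published hex digest."""
    return sha256_of(path) == expected_hex.strip().lower()
```

For example, `matches_digest("datarun-0.2.6.tar.gz", "fac2f18407c869dac8674543eb6c96411eb65dd16db3ea38271a078efb3a3f0d")` should return `True` for an unmodified copy of the source distribution.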