A simple data cleansing tool built on pandas, with machine learning models

datarun

datarun is a lightweight Python package that cleanses pandas DataFrames with minimal configuration.
It automatically handles duplicate rows, missing values, constant columns, and type conversion.
It also provides a LinearRegressionCustom model, used in the example further down.

Features

  • Drop duplicate rows
  • Handle missing values using mean, median, mode, or drop
  • Drop constant-value columns
  • Convert string-based numeric columns to proper types
  • Configurable and simple to use
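
The cleansing API of datarun itself is not shown in this description, so the following is only a minimal sketch of what the steps listed above amount to in plain pandas. The function name `cleanse` and its `strategy` parameter are hypothetical and are not part of datarun.

```python
import pandas as pd
from pandas.api import types as pdt


def cleanse(df: pd.DataFrame, strategy: str = "mean") -> pd.DataFrame:
    """Hypothetical helper (not datarun's API): plain-pandas version of the listed steps."""
    # 1. Drop duplicate rows.
    out = df.drop_duplicates().copy()

    # 2. Convert string columns to numeric when every non-null value parses.
    for col in out.columns:
        if pdt.is_object_dtype(out[col]) or pdt.is_string_dtype(out[col]):
            converted = pd.to_numeric(out[col], errors="coerce")
            if converted.notna().sum() == out[col].notna().sum():
                out[col] = converted

    # 3. Handle missing values: drop the rows, or fill each column.
    if strategy == "drop":
        out = out.dropna()
    else:
        for col in out.columns[out.isna().any()]:
            numeric = pdt.is_numeric_dtype(out[col])
            if strategy == "mean" and numeric:
                fill = out[col].mean()
            elif strategy == "median" and numeric:
                fill = out[col].median()
            else:
                # Mode is used for non-numeric columns and for strategy="mode".
                modes = out[col].mode()
                if modes.empty:
                    continue  # column is entirely missing; nothing to fill with
                fill = modes.iloc[0]
            out[col] = out[col].fillna(fill)

    # 4. Drop columns that hold a single constant value.
    constant = [c for c in out.columns if out[c].nunique(dropna=False) <= 1]
    return out.drop(columns=constant).reset_index(drop=True)


raw = pd.DataFrame(
    {
        "a": ["1", "2", "2", "4"],      # numeric values stored as strings
        "b": [1.0, 2.0, 2.0, None],     # one missing value
        "c": ["x", "x", "x", "x"],      # constant column
    }
)
clean = cleanse(raw, strategy="mean")
print(clean)
```

The type conversion runs before the missing-value step so that mean and median fills can apply to columns that arrive as strings.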

Example: Linear Regression

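from datarun import LinearRegressionCustom
import pandas as pd

# Load the dataset and pick the feature column and the target.
data = pd.read_csv("Salary_dataset.csv")
X = data[['YearsExperience']]  # double brackets keep X two-dimensional
y = data['Salary']

# Fit with gradient descent, then predict on the training data.
model = LinearRegressionCustom(method='gradient_descent', learning_rate=0.01, epochs=5000)
model.fit(X, y)
preds = model.predict(X)

# Inspect the fitted parameters.
print(model.get_params())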
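
The internals of `LinearRegressionCustom` are not documented here. As a rough illustration of what a `gradient_descent` method with `learning_rate` and `epochs` settings typically does, the sketch below fits a one-feature linear model by batch gradient descent on mean squared error using NumPy. The function `fit_gradient_descent` is hypothetical and may differ from datarun's actual implementation.

```python
import numpy as np


def fit_gradient_descent(x, y, learning_rate=0.01, epochs=5000):
    """Hypothetical sketch: fit y ≈ w * x + b by batch gradient descent on MSE."""
    x = np.asarray(x, dtype=float).ravel()
    y = np.asarray(y, dtype=float).ravel()
    n = len(x)
    w, b = 0.0, 0.0
    for _ in range(epochs):
        error = w * x + b - y  # residuals under the current parameters
        # Gradients of mean((w*x + b - y)^2) with respect to w and b.
        grad_w = (2.0 / n) * np.dot(error, x)
        grad_b = (2.0 / n) * error.sum()
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


# Synthetic data lying exactly on the line y = 2x + 1.
x_demo = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y_demo = 2.0 * x_demo + 1.0
w_fit, b_fit = fit_gradient_descent(x_demo, y_demo)
print(w_fit, b_fit)
```

With features on a much larger scale than this toy data, a learning rate of 0.01 can diverge, so scaling the inputs first is a common precaution.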

Installation

pip install datarun
