
A package that cleans data, performs basic EDA, and gives insight into basic models (Logistic Regression and Ridge)


simplefit


A Python package that cleans data, performs basic EDA, and returns scores for basic classification and regression models.

Overview

This package helps data scientists clean their data, perform basic EDA, visualize the results graphically, and analyse the performance of a baseline model alongside a basic classification or regression model (Logistic Regression or Ridge) on their data.
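To make the cleaning step concrete, here is a minimal sketch of that kind of tidy-up using pandas directly. The function name `clean_sketch` and the toy data are hypothetical, and this is not simplefit's own implementation: it returns a single dataframe and does not try to reproduce the return value of the package's `cleaner`.

```python
import pandas as pd


def clean_sketch(df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical stand-in for a cleaning step; not simplefit's own code."""
    cleaned = df.copy()
    # Strip leading/trailing white space from column names.
    cleaned.columns = [str(col).strip() for col in cleaned.columns]
    # Strip white space from string cells; numeric columns are skipped.
    for col in cleaned.columns:
        if not pd.api.types.is_numeric_dtype(cleaned[col]):
            cleaned[col] = cleaned[col].map(
                lambda v: v.strip() if isinstance(v, str) else v
            )
    # Drop rows that contain any missing value and renumber the index.
    return cleaned.dropna().reset_index(drop=True)


# Toy data (made up for illustration): padded strings and missing values.
raw = pd.DataFrame(
    {
        " city ": ["  Vancouver", "Toronto  ", None],
        "price": [1.5, None, 3.0],
    }
)
tidy = clean_sketch(raw)
print(tidy)
```

Only the first row survives here, because the other two rows each contain a missing value.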

Functions


| Function Name | Input | Output | Description |
| --- | --- | --- | --- |
| cleaner | dataframe | list of 3 dataframes | Loads and cleans the dataset: removes NA rows, strips extra white space, etc., and returns the cleaned data |
| plot_distributions | dataframe, bins, dist_cols, class_label | Altair histogram plot object | Creates numerical distribution plots for either all the numeric columns or the ones provided to it |
| plot_corr | dataframe, corr | Altair correlation plot object | Creates a correlation plot for all the columns in the dataframe |
| plot_splom | dataframe, pair_cols | Altair SPLOM plot object | Creates a SPLOM plot for all the numeric columns in the dataframe or the ones passed by the user |
| regressor | train_df, target_col, numeric_feats, categorical_feats, text_col, cv | dataframe | Preprocesses the data, fits a baseline model (Dummy Regressor) and Ridge with the default setup, and returns the model scores as a dataframe |
| classifier | train_df, target_col, numeric_feats, categorical_feats, text_col, cv | dataframe | Preprocesses the data, fits a baseline model (Dummy Classifier) and Logistic Regression with the default setup, and returns the model scores as a dataframe |
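As a rough illustration of the baseline-versus-model comparison that `regressor` is described as performing, the sketch below calls scikit-learn directly. The helper name `compare_regressors`, the toy data, and the shape of the returned dataframe are assumptions made for this example. It handles numeric features only, so the categorical and text preprocessing mentioned above is left out, and it should not be read as the package's actual implementation.

```python
import numpy as np
import pandas as pd
from sklearn.dummy import DummyRegressor
from sklearn.linear_model import Ridge
from sklearn.model_selection import cross_validate


def compare_regressors(train_df, target_col, numeric_feats, cv=5):
    """Hypothetical helper: mean cross-validation results for a baseline and Ridge."""
    X = train_df[numeric_feats]
    y = train_df[target_col]
    models = {"DummyRegressor": DummyRegressor(), "Ridge": Ridge()}
    results = {}
    for name, model in models.items():
        # Default scoring for regressors is R^2.
        scores = cross_validate(model, X, y, cv=cv, return_train_score=True)
        # Average fit_time, score_time, test_score and train_score across folds.
        results[name] = pd.DataFrame(scores).mean()
    # One column per model, one row per averaged metric.
    return pd.DataFrame(results)


# Synthetic data (made up for illustration): y depends linearly on x plus noise.
rng = np.random.default_rng(0)
x = rng.normal(size=200)
toy = pd.DataFrame({"x": x, "y": 3.0 * x + rng.normal(scale=0.1, size=200)})
scores = compare_regressors(toy, target_col="y", numeric_feats=["x"])
print(scores)
```

On data like this, Ridge should score close to 1 on `test_score` while the dummy baseline stays near 0. That gap is what a baseline comparison is meant to show. The same pattern applies to classification by swapping in `DummyClassifier` and `LogisticRegression`.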

Our Package in the Python Ecosystem


Parts of what our package does are available as standalone packages, namely auto-eda, eda-report, quick-eda, and s11-classifier. However, these packages cover only the EDA or only classification with XGBoostClassifier. Our package aims to cover all the basic steps of an ML pipeline and save the data scientist time and effort: it cleans and preprocesses the data, returns graphical visualisations from the EDA, and gives insight into basic model performance, after which the user can decide which other models to try.

Installation

$ pip install git+https://github.com/UBC-MDS/simplefit

Usage

Documentation Status

Contributing

Interested in contributing? Check out the contributing guidelines. Please note that this project is released with a Code of Conduct. By contributing to this project, you agree to abide by its terms.

Contributors

This Python package was developed by the following Master of Data Science program candidates at the University of British Columbia: Reza, Zoe, Navya, and Sanchit.

License

simplefit was created by Reza, Zoe, Navya, and Sanchit. It is licensed under the terms of the MIT license.

Credits

simplefit was created with cookiecutter and the py-pkgs-cookiecutter template.

