
Collection of useful modules that can assist in the process of data preparation

Project description

tacklepy, version 1.0.1, is a module designed to simplify the process of data preparation.

DataImputer is a Python module that handles missing values in datasets by predicting them and imputing the predicted values. It provides an interface for automating the handling of missing data and improving the completeness of datasets.

The module imputes numerical and categorical columns separately. It uses machine learning algorithms such as HistGradientBoosting, XGBoost, and CatBoost to predict missing values from highly correlated features. The algorithm used to predict NaNs is configurable, so users can select the approach best suited to their needs.

One of the key features of the DataImputer module is its ability to handle outliers in the data before performing imputation. Identifying and addressing outliers first keeps extreme values from distorting the prediction models, which helps produce more accurate imputation results.
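The description does not say which outlier method the module uses. As one common possibility, the sketch below clips values that fall outside the interquartile-range fences before any model is trained. The function name clip_outliers_iqr and the factor k=1.5 are assumptions made for illustration only.

```python
import numpy as np
import pandas as pd


def clip_outliers_iqr(series, k=1.5):
    """Clip values outside [Q1 - k*IQR, Q3 + k*IQR]; NaNs are left untouched.

    Hypothetical helper for illustration; not part of tacklepy's API.
    """
    # quantile() ignores NaNs, so missing values do not affect the fences.
    q1, q3 = series.quantile([0.25, 0.75])
    iqr = q3 - q1
    return series.clip(lower=q1 - k * iqr, upper=q3 + k * iqr)


values = pd.Series([10.0, 11.0, 12.0, 13.0, 14.0, 500.0, np.nan])
clipped = clip_outliers_iqr(values)
print(clipped.tolist())
```

Clipping keeps every row in place, so the missing entries can still be imputed afterwards; dropping outlier rows instead would be an equally valid design choice.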

DataImputer supports a wide range of tasks, including binary classification, multi-class classification, and regression. The type of column being imputed determines the specific task performed. The module provides options to exclude specific columns from the imputation process, control verbosity to receive informative output during execution, and define the size of the training set for the prediction models.
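How a column's type could map to a task can be sketched like this. The rule used here (numeric columns become regression, other columns become binary or multi-class classification depending on the number of distinct values) is an assumption for illustration, and infer_task is a hypothetical name rather than a tacklepy function.

```python
import pandas as pd
from pandas.api.types import is_bool_dtype, is_numeric_dtype


def infer_task(column):
    """Guess the prediction task for a column from its dtype and cardinality.

    Hypothetical helper for illustration; not part of tacklepy's API.
    """
    observed = column.dropna()
    # Booleans count as numeric in pandas, so exclude them explicitly.
    if is_numeric_dtype(observed) and not is_bool_dtype(observed):
        return "regression"
    return "binary" if observed.nunique() <= 2 else "multiclass"


frame = pd.DataFrame(
    {
        "age": [31.0, None, 45.0],
        "smoker": ["yes", "no", None],
        "city": ["Oslo", "Rome", "Lima"],
    }
)
tasks = {name: infer_task(frame[name]) for name in frame.columns}
print(tasks)
```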

Installation:

$ pip install tacklepy

To upgrade an existing installation:

$ pip install --upgrade tacklepy

Dependencies

DataImputer requires:

  • Python (version >= 3.6)

  • Pandas (version >= 2.0.2)

  • Numpy (version >= 1.23.5)

  • XGBoost (version >= 1.7.5)

  • CatBoost (version >= 1.2)

  • Scikit-learn (version >= 1.2.2)

  • Scipy (version >= 1.10.1)
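To check an environment against the minimums listed above, a small script along these lines may help. It is a sketch rather than a tool shipped with tacklepy: the helper names are hypothetical, the distribution names in MINIMUM_VERSIONS are assumed to match the PyPI package names, and importlib.metadata requires Python 3.8 or later.

```python
from importlib import metadata

# Minimum versions copied from the list above; keys are assumed PyPI names.
MINIMUM_VERSIONS = {
    "pandas": "2.0.2",
    "numpy": "1.23.5",
    "xgboost": "1.7.5",
    "catboost": "1.2",
    "scikit-learn": "1.2.2",
    "scipy": "1.10.1",
}


def parse_version(text):
    """Turn '1.23.5' into (1, 23, 5), ignoring suffixes such as 'rc1'."""
    parts = []
    for piece in text.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def meets_minimum(installed, minimum):
    """Compare two version strings after padding them to equal length."""
    a, b = parse_version(installed), parse_version(minimum)
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


def check_dependencies(minimums=MINIMUM_VERSIONS):
    """Return a status string ('ok', 'missing', 'too old (...)') per package."""
    report = {}
    for name, minimum in minimums.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = "missing"
            continue
        ok = meets_minimum(installed, minimum)
        report[name] = "ok" if ok else f"too old ({installed})"
    return report


for name, status in check_dependencies().items():
    print(f"{name}: {status}")
```

The version comparison is deliberately simple and only looks at the leading numeric components; a full implementation would normally rely on a dedicated version-parsing library.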

Development

At TacklePy, we value diversity and inclusivity in our community of contributors. Whether you're a seasoned developer or just starting out, we welcome you to join us in building a more helpful and effective platform. Our Development Guide provides comprehensive information on how you can contribute to our project through code, documentation, testing, and more. Take a look and see how you can get involved!

Important links

Source code

You can check the latest sources with the command:

git clone https://github.com/NikitaRomanov-ds/tacklepy.git

Submitting a Pull Request

Before opening a Pull Request, have a look at the full Contributing page to make sure your code complies with our guidelines: https://scikit-learn.org/stable/developers/index.html

Communication

Citation

If you use tacklepy in a media/research publication, we would appreciate citations to the following: paper/profile/website/etc.

Download files

Source Distribution

tacklepy-1.0.1.tar.gz (3.8 kB)
