
Collection of useful modules that can assist in the process of data preparation

Project description

tacklepy, version 1.0.1, is a collection of modules designed to simplify data preparation.

DataImputer is a Python module that handles missing values in a dataset by predicting and imputing them. It provides an interface for automating the handling of missing data so that datasets become more complete.

The module imputes numerical and categorical columns separately. It uses machine learning algorithms such as HistGradientBoosting, XGBoost, and CatBoost to predict missing values from highly correlated features. The algorithm used to predict NaNs is configurable, so users can pick the one that best suits their data.

A key feature of DataImputer is that it can handle outliers in the data before performing imputation. Identifying and addressing outliers first is intended to make the imputed values more accurate.
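The description does not say which outlier rule DataImputer applies. As one common possibility, the sketch below clips values outside the interquartile-range (IQR) fences before any model is trained; the function name `clip_outliers_iqr` and the multiplier `k` are assumptions for illustration.

```python
import pandas as pd


def clip_outliers_iqr(series, k=1.5):
    """Clip values outside [Q1 - k*IQR, Q3 + k*IQR]; NaNs are left as NaN.

    Illustrative sketch; IQR clipping is an assumed rule, not the library's.
    """
    q1 = series.quantile(0.25)  # quantile() ignores missing values
    q3 = series.quantile(0.75)
    iqr = q3 - q1
    return series.clip(lower=q1 - k * iqr, upper=q3 + k * iqr)
```

Clipping keeps every row, which matters when the remaining columns of that row are still needed as features for imputation.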

DataImputer supports binary classification, multi-class classification, and regression tasks; the type of the column being imputed determines which task is performed. The module also provides options to exclude specific columns from imputation, to control verbosity so that informative output is printed during execution, and to set the size of the training set for the prediction models.
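How a column type might map to a task can be sketched as follows. The rule used here (non-numeric or boolean columns become classification, split by the number of distinct values; numeric columns become regression) is an assumption, as is the function name `infer_task`; the library's actual rule may differ.

```python
import pandas as pd


def infer_task(column):
    """Return "binary", "multiclass", or "regression" for a pandas Series.

    Illustrative sketch; the mapping rule is an assumption.
    """
    values = column.dropna()
    is_bool = pd.api.types.is_bool_dtype(values)
    is_numeric = pd.api.types.is_numeric_dtype(values)
    # Booleans count as numeric in pandas, so they are checked first.
    if is_bool or not is_numeric:
        return "binary" if values.nunique() <= 2 else "multiclass"
    return "regression"
```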

Installation:

$ pip install tacklepy

$ pip install --upgrade tacklepy

Dependencies

DataImputer requires:

  • Python (version >= 3.6)
  • Pandas (version >= 2.0.2)
  • Numpy (version >= 1.23.5)
  • XGBoost (version >= 1.7.5)
  • CatBoost (version >= 1.2)
  • Scikit-learn (version >= 1.2.2)
  • Scipy (version >= 1.10.1)
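To see which of these dependencies are installed in the current environment, a small standard-library check along the following lines may help. The helper name `installed_versions` is hypothetical, the distribution names are the usual PyPI names for the packages listed above, and `importlib.metadata` requires Python 3.8 or later.

```python
from importlib.metadata import PackageNotFoundError, version

# Minimum versions copied from the dependency list above.
REQUIRED = {
    "pandas": "2.0.2",
    "numpy": "1.23.5",
    "xgboost": "1.7.5",
    "catboost": "1.2",
    "scikit-learn": "1.2.2",
    "scipy": "1.10.1",
}


def installed_versions(requirements=REQUIRED):
    """Map each package name to (installed version or None, required minimum)."""
    report = {}
    for name, minimum in requirements.items():
        try:
            found = version(name)
        except PackageNotFoundError:
            found = None  # package is not installed
        report[name] = (found, minimum)
    return report


if __name__ == "__main__":
    for name, (found, minimum) in installed_versions().items():
        print(f"{name}: installed={found}, required>={minimum}")
```

The sketch only reports versions side by side; it does not compare them.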

Development

At TacklePy, we value diversity and inclusivity in our community of contributors. Whether you're a seasoned developer or just starting out, we welcome you to join us in building a more helpful and effective platform. Our Development Guide provides comprehensive information on how you can contribute to our project through code, documentation, testing, and more. Take a look and see how you can get involved!

Important links

Source code

You can check the latest sources with the command:

git clone https://github.com/NikitaRomanov-ds/tacklepy.git

Submitting a Pull Request

Before opening a Pull Request, have a look at the full Contributing page to make sure your code complies with our guidelines: https://scikit-learn.org/stable/developers/index.html

Communication

Citation

If you use tacklepy in a media or research publication, we would appreciate a citation.


