These details have not been verified by PyPI

Reason this release was yanked: outdated

Project description

Classifier Toolkit

This is a new project.

Table of Contents

Installation
Usage
Modules Overview
Future Work

Installation

This library is published on PyPI. To install it, run `pip install classifier_toolkit`.

Usage

This library automates binary classification tasks in the finance domain, specifically for default and fraud labeling. It includes several packages designed to address the main steps in any machine learning/data science task:

EDA: accessible through EDA_Toolkit. This package provides the EDA and feature engineering functionality, along with the accompanying visualizations.
Feature Selection: embedded, wrapper, and meta selection methods (see the Feature Selection entry in the Modules Overview).
Model fitting and hyperparameter tuning: To be implemented.
Evaluation and reporting: To be implemented.

The package architectures will be documented here in a future release. For now, please consult the docstrings of the individual methods in the relevant modules.

Note: this library does not include data wrangling steps, although it does include feature engineering. Data wrangling is an intermediate step between EDA and feature engineering, in which users should fix any data quality issues themselves. Running the EDA first is therefore crucial for catching such issues before moving on to feature engineering and the subsequent steps.
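To make the note above concrete, here is a minimal sketch of the kind of data quality checks worth running before feature engineering: missing values, duplicate rows, and outliers. It uses only the Python standard library and is not the toolkit's own API. The function name `quality_warnings`, its parameters, and the sample records are hypothetical, and the 1.5 x IQR outlier rule is an assumed convention.

```python
from statistics import quantiles


def quality_warnings(rows, numeric_cols, iqr_factor=1.5):
    """Return warning strings for a non-empty list of dict records.

    All names here are illustrative; this is not classifier_toolkit code.
    """
    warnings = []
    columns = list(rows[0].keys())

    # Missing values: count None entries per column.
    for col in columns:
        n_missing = sum(1 for r in rows if r.get(col) is None)
        if n_missing:
            warnings.append(f"{col}: {n_missing} missing value(s)")

    # Duplicates: rows whose values repeat an earlier row exactly.
    seen = set()
    n_dupes = 0
    for r in rows:
        key = tuple(r.get(col) for col in columns)
        if key in seen:
            n_dupes += 1
        seen.add(key)
    if n_dupes:
        warnings.append(f"{n_dupes} duplicate row(s)")

    # Outliers: values outside [Q1 - k*IQR, Q3 + k*IQR], ignoring None.
    for col in numeric_cols:
        values = [r[col] for r in rows if r.get(col) is not None]
        if len(values) < 4:
            continue  # too few points for meaningful quartiles
        q1, _, q3 = quantiles(values, n=4)
        spread = iqr_factor * (q3 - q1)
        n_out = sum(1 for v in values if v < q1 - spread or v > q3 + spread)
        if n_out:
            warnings.append(f"{col}: {n_out} outlier(s) by the IQR rule")

    return warnings


# Hypothetical sample records for illustration only.
rows = [
    {"age": 25, "income": 40_000},
    {"age": 31, "income": 42_000},
    {"age": None, "income": 41_000},
    {"age": 31, "income": 42_000},   # exact repeat of the second row
    {"age": 29, "income": 39_000},
    {"age": 27, "income": 900_000},  # extreme income
]

for message in quality_warnings(rows, numeric_cols=["age", "income"]):
    print(message)
```

On the sample records this reports one missing `age`, one duplicate row, and one `income` outlier. How to resolve each issue (drop, impute, cap) remains a data wrangling decision that stays with the user.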

Modules Overview

EDA Toolkit: This module includes classes and methods for performing comprehensive exploratory data analysis. It provides automated warnings for data quality issues, univariate and bivariate analysis, and various data visualizations to help understand the dataset.
Univariate Analysis: This class focuses on the analysis of individual variables. It includes methods for calculating statistical measures, visualizing distributions, and assessing the relationship between each variable and a target through techniques like Cramér's V and Information Value. This helps in understanding the significance and distribution of each feature independently.
Bivariate Analysis: This class deals with the analysis of two variables to understand their relationship. It includes functionalities for generating correlation heatmaps, performing ANOVA tests between numerical and categorical variables, and computing pairwise Cramér's V for categorical features. This aids in identifying patterns and correlations between pairs of variables, which is crucial for feature selection and engineering.
Feature Engineering: This module assists in transforming features, handling missing values, encoding categorical variables, and more. It aims to enhance the dataset's quality for better model performance.
Visualizations: This module offers a wide range of plotting capabilities to visually analyze data distributions, relationships, and other crucial aspects of the dataset.
Automated Warnings: A utility to automatically check the dataset for common issues such as missing or duplicate values, outliers, and more, providing warnings to guide data cleaning efforts.
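For readers unfamiliar with the two association measures named under Univariate Analysis, the sketch below computes Cramér's V and Information Value in plain Python. It illustrates the standard formulas and is not the toolkit's implementation. The function names, the additive `smoothing` term (used to avoid taking the log of zero), and the sample data are assumptions.

```python
from collections import Counter
from math import log, sqrt


def cramers_v(x, y):
    """Cramér's V between two categorical sequences (no bias correction)."""
    n = len(x)
    joint = Counter(zip(x, y))
    x_tot, y_tot = Counter(x), Counter(y)

    # Pearson chi-squared statistic over the full contingency table.
    chi2 = 0.0
    for a in x_tot:
        for b in y_tot:
            expected = x_tot[a] * y_tot[b] / n
            chi2 += (joint[(a, b)] - expected) ** 2 / expected

    k = min(len(x_tot), len(y_tot)) - 1
    return sqrt(chi2 / (n * k)) if k > 0 else 0.0


def information_value(feature, target, smoothing=0.5):
    """Information Value of a categorical feature for a 0/1 target.

    `smoothing` is an assumed additive correction, not a toolkit setting.
    """
    events = Counter(f for f, t in zip(feature, target) if t == 1)
    non_events = Counter(f for f, t in zip(feature, target) if t == 0)
    categories = set(feature)
    n_event = sum(events.values()) + smoothing * len(categories)
    n_non = sum(non_events.values()) + smoothing * len(categories)

    iv = 0.0
    for c in categories:
        p_event = (events[c] + smoothing) / n_event
        p_non = (non_events[c] + smoothing) / n_non
        # Each term is (difference in shares) * weight of evidence.
        iv += (p_non - p_event) * log(p_non / p_event)
    return iv


# Hypothetical data: `segment` separates defaults perfectly, `channel` not at all.
default = [1, 1, 0, 0]
segment = ["a", "a", "b", "b"]
channel = ["web", "app", "web", "app"]

print(cramers_v(segment, default), cramers_v(channel, default))
print(information_value(segment, default), information_value(channel, default))
```

Both measures are zero for a feature that carries no information about the target and grow with the strength of the association. Cramér's V is bounded by 1, while Information Value has no upper bound.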
Feature Selection: This module provides various feature selection techniques:
- Embedded Methods: Includes ElasticNet for regularization-based feature selection.
- Wrapper Methods:
  - Recursive Feature Elimination (RFE) with support for various ensemble methods (Random Forest, XGBoost, LightGBM, CatBoost).
  - Sequential Feature Selection (forward, backward, floating, and bidirectional).
- Meta Selector: Combines multiple feature selection methods to provide a robust selection.
- Utility Functions: Includes scoring functions and plotting utilities for feature importance visualization.
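The description above does not say how the Meta Selector combines the individual methods. One common approach, shown here purely as an assumption, is majority voting: keep the features that at least `min_votes` selectors agree on. The function name `vote_select`, the selector labels, and the feature names are all hypothetical.

```python
from collections import Counter


def vote_select(selections, min_votes=2):
    """Keep features chosen by at least `min_votes` selectors.

    `selections` maps a selector label to the features it picked.
    This voting rule is an assumed example, not the toolkit's logic.
    """
    # Count each feature once per selector, even if listed twice.
    votes = Counter(f for chosen in selections.values() for f in set(chosen))
    kept = [f for f, v in votes.items() if v >= min_votes]
    # Most-voted first; ties broken alphabetically for a stable result.
    return sorted(kept, key=lambda f: (-votes[f], f))


# Hypothetical outputs of three selection methods.
selections = {
    "elastic_net": ["income", "age", "n_loans"],
    "rfe": ["income", "n_loans", "tenure"],
    "sequential": ["income", "age", "region"],
}

print(vote_select(selections))  # ['income', 'age', 'n_loans']
```

Raising `min_votes` makes the selection stricter. With `min_votes=3` only `income` survives in this example.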

Future Work

The next planned improvements and additions to the library include:

Adding model fitting and hyperparameter tuning functionalities.
Developing comprehensive evaluation and reporting tools to assist with model assessment.
Expanding documentation to include architecture diagrams and detailed usage examples.

Project details

Release history

Version	Release date	Status
0.2.2	Aug 13, 2025	
0.2.1	Jan 6, 2025	yanked (reason: outdated)
0.2.0 (this version)	Nov 13, 2024	yanked (reason: outdated)
0.1.4	Sep 19, 2024	yanked (reason: outdated)
0.1.0	Sep 13, 2024	yanked (reason: outdated)

Download files

Source Distribution

classifier_toolkit-0.2.0.tar.gz (59.4 kB view details)

Uploaded Nov 13, 2024 Source

Built Distribution

classifier_toolkit-0.2.0-py3-none-any.whl (71.3 kB view details)

Uploaded Nov 13, 2024 Python 3

File details

Details for the file classifier_toolkit-0.2.0.tar.gz.

File metadata

Download URL: classifier_toolkit-0.2.0.tar.gz
Upload date: Nov 13, 2024
Size: 59.4 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: poetry/1.8.3 CPython/3.10.14 Darwin/24.1.0

File hashes

Hashes for classifier_toolkit-0.2.0.tar.gz
Algorithm	Hash digest
SHA256	`e1e8c9283bb8c554fb28dd2d23e243070df25ab529b404c849bc8e539aa35d4c`
MD5	`ed24317c8ed2c1e370a1c4d81d00573d`
BLAKE2b-256	`7ebf9bb065abb68e29ea1303d0c18b6395cf9d038def2c77d7582059cf4e134a`

File details

Details for the file classifier_toolkit-0.2.0-py3-none-any.whl.

File metadata

Download URL: classifier_toolkit-0.2.0-py3-none-any.whl
Upload date: Nov 13, 2024
Size: 71.3 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: poetry/1.8.3 CPython/3.10.14 Darwin/24.1.0

File hashes

Hashes for classifier_toolkit-0.2.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`cdb31cbd0f2dcabe9654a4a81cd2c87016cb2ae16cd090dff0d5f8172a1be5e3`
MD5	`6bdc978435420ddd4f4cfc2655fab6a9`
BLAKE2b-256	`92ddbc7ed60223e5c53792a4f9ee9a4844cb1095b9773dca79838992c1e38506`
