Automatic Discretization of Features with Optimal Target Association

ReadTheDocs

Check out the package documentation on ReadTheDocs!

Install

AutoCarver can be installed from PyPI:

pip install autocarver

Why AutoCarver?

AutoCarver is a Python package designed to answer a fundamental question: "What's the best processing for my model's features?"

It offers an automated and optimized approach to processing and engineering your data, resulting in improved model performance, enhanced explainability, and reduced feature dimensionality. As of today, this set of tools is available for binary classification and regression problems only.

Key Features:

  1. Data Processing and Engineering: AutoCarver performs automatic bucketization and carving of a DataFrame's columns to maximize their correlation with a target variable. By leveraging advanced techniques, it optimizes the preprocessing steps for your data, leading to enhanced predictive accuracy.

  2. Improved Model Explainability: AutoCarver aids in understanding the relationship between the processed features and the target variable. By uncovering meaningful patterns and interactions, it provides valuable insights into the underlying data dynamics, enhancing the interpretability of your models.

  3. Reduced Feature Dimensionality: AutoCarver excels at reducing feature dimensionality, especially in scenarios involving one-hot encoding. It identifies and preserves only the most statistically relevant modalities, ensuring that your models focus on the most informative aspects of the data while eliminating noise and redundancy.

  4. Statistical Accuracy and Relevance: AutoCarver incorporates statistical techniques to ensure that the selected modalities have a sufficient number of observations, minimizing the risk of drawing conclusions based on insufficient data. This helps maintain the reliability and validity of your models.

  5. Robustness Testing: AutoCarver goes beyond feature processing by assessing the robustness of the selected modalities. It performs tests to evaluate the stability and consistency of the chosen features across different datasets or subsets, ensuring their reliability in various scenarios.
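To make the idea behind features 1, 3 and 4 concrete, here is a minimal standard-library sketch of target-aware bucketization. It is not AutoCarver's API or its actual algorithm: the function names (`quantile_cuts`, `bucketize`, `cramers_v`, `best_cuts`) and the parameters (`n_buckets`, `max_groups`, `min_freq`) are hypothetical, and Cramér's V is used here only as one possible association measure for a binary target. The sketch starts from quantile cut points, tries merging consecutive buckets, discards groupings where a group holds too few observations, and keeps the grouping most associated with the target.

```python
from bisect import bisect_right
from collections import Counter
from itertools import combinations
from math import sqrt


def quantile_cuts(values, n_buckets):
    """Candidate cut points taken at roughly equal-frequency positions."""
    ordered = sorted(values)
    n = len(ordered)
    return sorted({ordered[k * n // n_buckets] for k in range(1, n_buckets)})


def bucketize(values, cuts):
    """Map each value to the index of the interval it falls in."""
    return [bisect_right(cuts, v) for v in values]


def cramers_v(groups, target):
    """Cramér's V between a grouped feature and a categorical target."""
    n = len(groups)
    joint = Counter(zip(groups, target))
    rows, cols = Counter(groups), Counter(target)
    chi2 = 0.0
    for g in rows:
        for t in cols:
            expected = rows[g] * cols[t] / n
            chi2 += (joint[(g, t)] - expected) ** 2 / expected
    k = min(len(rows), len(cols)) - 1
    return sqrt(chi2 / (n * k)) if k > 0 else 0.0


def best_cuts(values, target, n_buckets=6, max_groups=3, min_freq=0.1):
    """Pick the subset of candidate cuts with the highest Cramér's V.

    Groupings with an empty group, or a group below `min_freq` of the
    sample, are skipped. Ties favour the grouping found first (fewer groups).
    """
    candidates = quantile_cuts(values, n_buckets)
    best_score, best = 0.0, []
    for size in range(1, max_groups):  # `size` cuts give `size + 1` groups
        for cuts in combinations(candidates, size):
            groups = bucketize(values, cuts)
            counts = Counter(groups)
            if len(counts) < size + 1:
                continue
            if min(counts.values()) < min_freq * len(values):
                continue
            score = cramers_v(groups, target)
            if score > best_score + 1e-12:
                best_score, best = score, list(cuts)
    return best_score, best


if __name__ == "__main__":
    # Toy data: the target flips from 0 to 1 once the feature reaches 50.
    values = list(range(100))
    target = [1 if v >= 50 else 0 for v in values]
    print(best_cuts(values, target))
```

On this toy data the search settles on a grouping that includes the cut at 50, where the target changes. An exhaustive search over combinations grows quickly with `n_buckets`, which is why the sketch starts from a small number of quantile buckets.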

AutoCarver is a valuable tool for data scientists and practitioners working on binary classification or regression problems, such as credit scoring, fraud detection, and risk assessment. Its automated feature processing can lead to more accurate predictions, improved model explainability, and better decision-making in your classification and regression tasks.
