Modular churn modeling pipelines.

Churn Modeling Pipelines

Python >= 3.8 | License: MIT

A modular, extensible, and production-ready Python package for end-to-end churn prediction and customer analytics. This package supports structured preprocessing, feature engineering, model building, evaluation, visualization, ensemble modeling, and exploratory data analysis.


📦 Module Overview

| Module | Description |
| --- | --- |
| ChurnPreprocessor | Prepares and standardizes raw churn data (e.g., encoding, type fixing, imputation). |
| DataPreprocessor | Additional utility class for transforming and cleaning structured datasets. |
| ChurnModelBuilder | Builds 5 hyperparameter variants each for Logistic Regression, Decision Tree, KNN, Naive Bayes, SVM, Random Forest, XGBoost, LightGBM, and CatBoost. |
| ChurnEvaluator | Computes evaluation metrics (Accuracy, Precision, Recall, F1, Cost) and supports confusion matrices, ROC curves, and model comparison. |
| ChurnPlotter | Visualizes performance results, including composite scores, cost sensitivity, and radar charts for base and ensemble models. |
| ModelComparator | Compares all models across cost, recall, and composite score to select the best-performing variants. |
| EnsembleBuilder | Builds ensemble models (Voting, Stacking, Bagging, Boosting, Blending, etc.) with 5 tuned variants per type. |
| CustomerJourneyClassifier | Segments users into customer journey stages based on tenure, support interactions, and satisfaction. |
| EDAHelper | Unified EDA interface that wraps profiling, visualization, and hypothesis testing. |
| EDAReports | Provides data profiling reports (missing values, types, summaries). |
| EDAPlots | Generates annotated univariate, bivariate, and multivariate plots. |
| ChurnHypothesisTester | Performs statistical tests to validate churn-related hypotheses. |
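
To make the metrics listed for ChurnEvaluator concrete, here is a minimal, dependency-free sketch of how Accuracy, Precision, Recall, F1, and a misclassification cost can be derived from binary predictions. It is not the package's implementation: the `evaluate` function, the `ChurnMetrics` container, and the cost weights (`fn_cost`, `fp_cost`) are hypothetical names and values chosen for illustration, on the assumption that "Cost" means a weighted count of false negatives and false positives.

```python
from dataclasses import dataclass


@dataclass
class ChurnMetrics:
    accuracy: float
    precision: float
    recall: float
    f1: float
    cost: float


def evaluate(y_true, y_pred, fn_cost=5.0, fp_cost=1.0):
    """Compute basic binary metrics plus a weighted error cost.

    fn_cost and fp_cost are illustrative weights (assumption): a missed
    churner is treated as five times as expensive as a false alarm.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    cost = fn * fn_cost + fp * fp_cost
    return ChurnMetrics(accuracy, precision, recall, f1, cost)


# Toy labels: 1 = churned, 0 = retained
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 1, 0, 1, 0, 0, 1]
metrics = evaluate(y_true, y_pred)
print(metrics)
```

Weighting false negatives more heavily than false positives reflects the common churn setting where losing a customer costs more than an unnecessary retention offer; the actual weights should come from your own business figures.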

🚀 Quick Start

import pandas as pd
from churn_modeling_pipelines import (
    ChurnPreprocessor,
    DataPreprocessor,
    ChurnModelBuilder,
    ChurnEvaluator,
    ChurnPlotter,
    EnsembleBuilder,
    ModelComparator,
    CustomerJourneyClassifier,
    EDAHelper
)

# Load your dataset
df = pd.read_csv("customer_data.csv")

# Step 1: Preprocess
pre = ChurnPreprocessor(df)
processed_df = pre.full_pipeline()

# Step 2: Explore (EDA)
eda = EDAHelper(processed_df)
eda.reports.data_profile()
eda.plots.univariate_numeric("RevPerMonth")
eda.hypothesis.test_churn_hypotheses_stats()

# Step 3: Train/Test Split
from sklearn.model_selection import train_test_split
X = processed_df.drop("Churn", axis=1)
y = processed_df["Churn"]
# random_state makes the split reproducible (42 is an arbitrary choice)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

# Step 4: Model Building
builder = ChurnModelBuilder(X_train, X_test, y_train)
lr_models = builder.build_logistic_regression()

# Step 5: Evaluation
evaluator = ChurnEvaluator()
results = evaluator.evaluate_models(lr_models, X_test, y_test)

# Step 6: Plotting
plotter = ChurnPlotter()
plotter.plot_model_comparison(results)

# Step 7: Ensemble Models
ensemble = EnsembleBuilder(X_train, X_test, y_train)
stacking_models = ensemble.build_stacking()

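# Step 8: Compare All
# Note: `+` concatenates the two inputs, so `results` and `stacking_models`
# must be lists holding the same kind of entry. If build_stacking() returns
# unevaluated models, pass them through evaluator.evaluate_models(...) first.
comparator = ModelComparator()
summary = comparator.generate_model_summary(results + stacking_models)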

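
For readers wondering what "5 hyperparameter variants" of a model might look like in Step 4, the following sketch builds five Logistic Regression variants directly with scikit-learn on synthetic data. It does not reproduce what `build_logistic_regression()` does internally; the parameter grid, the variant names (`LR_v1` to `LR_v5`), and the result dictionary layout are all assumptions made for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import recall_score
from sklearn.model_selection import train_test_split

# Synthetic, imbalanced stand-in for a churn dataset (about 20% positives)
X, y = make_classification(
    n_samples=400, n_features=8, weights=[0.8, 0.2], random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# Hypothetical grid: four regularization strengths plus one class-weighted model
param_grid = [
    {"C": 0.01},
    {"C": 0.1},
    {"C": 1.0},
    {"C": 10.0},
    {"C": 1.0, "class_weight": "balanced"},
]

variants = []
for i, params in enumerate(param_grid, start=1):
    model = LogisticRegression(max_iter=1000, **params).fit(X_train, y_train)
    recall = recall_score(y_test, model.predict(X_test))
    variants.append(
        {"name": f"LR_v{i}", "params": params, "model": model, "recall": recall}
    )

for v in variants:
    print(f'{v["name"]}: params={v["params"]} recall={v["recall"]:.2f}')
```

Keeping each variant's parameters next to its fitted model and score makes it straightforward to rank variants later on whichever metric matters most, such as recall or cost.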

Install from PyPI (the leading ! is for Jupyter or Colab notebooks; omit it in a regular shell):

!pip install churn-modeling-pipelines


🧠 Requirements
Python 3.8 or higher

pandas
numpy
matplotlib
seaborn
scikit-learn
xgboost
lightgbm
catboost
scipy
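
Before running the pipeline, it can help to confirm that the interpreter and the packages listed above are available. The sketch below uses only the standard library; the helper names (`missing_dependencies`, `python_ok`) are hypothetical, and the mapping reflects the fact that scikit-learn is imported as `sklearn`.

```python
import sys
from importlib.util import find_spec

# Distribution name -> importable module name
REQUIRED = {
    "pandas": "pandas",
    "numpy": "numpy",
    "matplotlib": "matplotlib",
    "seaborn": "seaborn",
    "scikit-learn": "sklearn",
    "xgboost": "xgboost",
    "lightgbm": "lightgbm",
    "catboost": "catboost",
    "scipy": "scipy",
}


def missing_dependencies(required=None):
    """Return the distribution names whose module cannot be found."""
    required = REQUIRED if required is None else required
    return [dist for dist, module in required.items() if find_spec(module) is None]


def python_ok(minimum=(3, 8)):
    """Check that the running interpreter meets the minimum version."""
    return sys.version_info >= minimum


print("Python version OK:", python_ok())
print("Missing packages:", missing_dependencies() or "none")
```

`find_spec` only locates a module without importing it, so the check stays fast even for heavy packages.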

🪪 License
MIT License © 2025
Developed and maintained by John Ebikake

💬 Contact
For issues, suggestions, or contributions:

GitHub: github.com/Ebikake/churn-modeling-pipelines

Email: ebikakejay@gmail.com

Built distribution: churn_modeling_pipelines-0.2.21-py3-none-any.whl (27.0 kB), uploaded for Python 3

Hashes for churn_modeling_pipelines-0.2.21-py3-none-any.whl

| Algorithm | Hash digest |
| --- | --- |
| SHA256 | 40b251ce33667dc0f808529577ea2920e6b2291b063d2d90b2ff9a6ec619bf4b |
| MD5 | 70c7040bdf3de7119c5b10459064db1b |
| BLAKE2b-256 | d75e4cfccbfb7d71e06f5a5bdbb0e4667d9ec53544439d09dddeac9034fd3089 |
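
The published SHA256 digest can be used to check a downloaded wheel before installing it. The following is a minimal sketch using the standard library's `hashlib`; the function names and the local file path in the usage comment are assumptions, while the expected digest is the SHA256 value listed above.

```python
import hashlib

# SHA256 digest published for churn_modeling_pipelines-0.2.21-py3-none-any.whl
EXPECTED_SHA256 = "40b251ce33667dc0f808529577ea2920e6b2291b063d2d90b2ff9a6ec619bf4b"


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large files are not loaded into memory at once."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_wheel(path, expected=EXPECTED_SHA256):
    """Return True when the file's SHA256 digest matches the expected value."""
    return sha256_of(path) == expected.lower()


# Usage (the path is hypothetical; point it at your downloaded file):
# verify_wheel("churn_modeling_pipelines-0.2.21-py3-none-any.whl")
```

pip can perform the same check itself when a requirements file pins the hash and is installed with `--require-hashes`.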
