
Survival Analysis: Customer Churn and CLV Prediction


# Survival Analysis Package

## Overview

The Survival Analysis package is a Python toolkit for analyzing and predicting customer churn and lifetime value using survival analysis techniques. This package encompasses several modules that cover database schema creation, SQL interactions, predictive modeling, and utility functions for data preprocessing.

## Installation

`pip install survival-analysis`

The package is available on PyPI: https://pypi.org/project/survival-analysis/0.0.1/

## Documentation

Detailed information about our package can be found at https://anna-shaljyan.github.io/mkdocs-survival-analysis/?fbclid=IwAR2Kxzv_bs3WhMpGeU9jP0lKwvQ-sGPK_EG4ualMhqPFglEX9Nhoo8bE8N0

## Modules

### 1. schema.py

#### Module Description:

The `schema` module defines and creates the database schema using SQLAlchemy. It defines the tables `DimCustomer`, `FactPredictions`, `FactPushNotification`, and `FactEmail`, which store customer information, prediction data, push notification details, and email information, respectively.

`from survival_analysis import schema`

The resulting database has the following structure:

![Database ERD](survival_analysis/docs/ERD.jpg)

### 2. sql_interactions.py

#### Module Description:

The `sql_interactions` module provides a Python class named `SqlHandler` for interacting with SQLite databases. The class supports operations such as connecting to a database, inserting and retrieving data, and truncating, dropping, and updating tables.

`from survival_analysis import sql_interactions`
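The kind of operations listed above can be sketched with the standard-library `sqlite3` module. This is only an illustration, not the package's `SqlHandler`: the class `MiniSqlHandler`, its method names, and the `DimCustomer` columns used here are all hypothetical.

```python
import sqlite3


class MiniSqlHandler:
    """Hypothetical stand-in for illustration; NOT the package's SqlHandler."""

    def __init__(self, db_path: str, table: str):
        self.conn = sqlite3.connect(db_path)
        self.table = table

    def insert_many(self, columns, rows):
        # Values are bound through "?" placeholders, never formatted into the SQL.
        placeholders = ", ".join("?" for _ in columns)
        sql = f"INSERT INTO {self.table} ({', '.join(columns)}) VALUES ({placeholders})"
        with self.conn:  # commits on success, rolls back on error
            self.conn.executemany(sql, rows)

    def fetch_all(self):
        return self.conn.execute(f"SELECT * FROM {self.table}").fetchall()

    def truncate(self):
        # SQLite has no TRUNCATE statement; DELETE without WHERE empties the table.
        with self.conn:
            self.conn.execute(f"DELETE FROM {self.table}")

    def close(self):
        self.conn.close()


# Usage against an in-memory database with assumed (hypothetical) columns.
handler = MiniSqlHandler(":memory:", "DimCustomer")
handler.conn.execute(
    "CREATE TABLE DimCustomer (customer_id INTEGER PRIMARY KEY, tenure REAL)"
)
handler.insert_many(["customer_id", "tenure"], [(1, 12.0), (2, 30.5)])
rows = handler.fetch_all()
handler.truncate()
remaining = handler.fetch_all()
handler.close()
```

Consult the package documentation for the actual `SqlHandler` constructor arguments and method names.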

### 3. model_AFT.py

#### Module Description:

The model_AFT module implements an Accelerated Failure Time (AFT) model for predicting customer churn and lifetime value. It includes classes for different AFT models, a model selector for choosing the best model based on AIC, and methods for fitting the model and generating predictions.

`from survival_analysis import model_AFT`
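Selecting a model by AIC can be sketched as follows. AIC is defined as 2k - 2 ln(L), where k is the number of fitted parameters and L is the maximized likelihood; the model with the lowest AIC is preferred. The candidate names and log-likelihood values below are made-up placeholders, and `select_by_aic` is a hypothetical helper, not the package's model selector.

```python
def aic(log_likelihood: float, n_params: int) -> float:
    """Akaike information criterion: 2k - 2 ln(L)."""
    return 2 * n_params - 2 * log_likelihood


def select_by_aic(candidates):
    """Return the name with the lowest AIC, plus all AIC scores.

    candidates maps a model name to (log_likelihood, n_params).
    """
    scores = {name: aic(ll, k) for name, (ll, k) in candidates.items()}
    best = min(scores, key=scores.get)
    return best, scores


# Placeholder values for illustration only; they are not results from real data.
candidates = {
    "weibull": (-1450.2, 5),
    "lognormal": (-1447.9, 5),
    "loglogistic": (-1449.1, 5),
}
best_name, aic_scores = select_by_aic(candidates)
```

With an equal number of parameters, the comparison reduces to picking the highest log-likelihood; the penalty term only matters when candidates differ in complexity.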

### 4. utils.py

#### Module Description:

The `utils` module contains utility functions, including `format_dataframe`, which converts categorical variables to binary columns using one-hot encoding and ensures that numeric variables have the correct data types.
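To illustrate what one-hot encoding and numeric type coercion mean, here is a minimal pure-Python sketch that operates on a list of dictionaries. It is not the package's `format_dataframe`, whose signature is not shown here; the function `one_hot_rows` and the column names `plan` and `age` are hypothetical.

```python
def one_hot_rows(rows, categorical, numeric):
    """One-hot encode categorical columns and cast numeric columns to float."""
    # Collect the distinct levels of each categorical column in a stable order.
    levels = {col: sorted({row[col] for row in rows}) for col in categorical}
    encoded = []
    for row in rows:
        out = {col: float(row[col]) for col in numeric}
        for col in categorical:
            for level in levels[col]:
                # One binary column per level: 1 if the row has that level, else 0.
                out[f"{col}_{level}"] = 1 if row[col] == level else 0
        encoded.append(out)
    return encoded


# Hypothetical input: "age" arrives as text, "plan" is categorical.
raw = [
    {"plan": "basic", "age": "34"},
    {"plan": "premium", "age": "51"},
]
encoded = one_hot_rows(raw, categorical=["plan"], numeric=["age"])
```

Each row ends up with a float `age` plus one binary column per plan level (`plan_basic`, `plan_premium`).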

`from survival_analysis import utils`

## Example Usage

An example demonstrating the use of this package can be found at https://github.com/ella-2002e/MA-SurvivalAnalysis-Project/blob/main/Example.ipynb

# API

The API extends our Survival Analysis project with functionality to select data from the database and insert data. The API now includes several endpoints that help identify top customers with the highest churn rate, customers with the highest/lowest CLV, etc.

## Usage

Run `run.py` by executing `python run.py` in a terminal with your virtual environment activated. The root URL initially returns a message; append `/docs` to the address to see the PUT methods and the two GET endpoints. The address should look something like this: http://127.0.0.1:8000/docs#/

## ENDPOINTS

### GET

#### 1. get_top_churn_clv_customers

Accepts `pred_period` and a percentage. Customers are sorted first by `churn_rate` and then by CLV, and the top x% are returned.

#### 2. get_top_clv_customers

Accepts `pred_period` and a percentage. Customers are sorted by CLV, and the top x% are returned.
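The "top x%" selection behind both GET endpoints can be sketched as below. This is an assumption about the logic, not the API's actual implementation: the helper `top_percent`, the record fields, the descending sort, and rounding the cutoff up with `ceil` are all illustrative choices, and the sample records are made up.

```python
import math


def top_percent(customers, percent, sort_keys):
    """Return the top `percent`% of customers, ranked descending by sort_keys."""
    if not 0 < percent <= 100:
        raise ValueError("percent must be in (0, 100]")
    # Sort by the first key, breaking ties with the following keys.
    ranked = sorted(
        customers,
        key=lambda c: tuple(c[k] for k in sort_keys),
        reverse=True,
    )
    n_keep = math.ceil(len(ranked) * percent / 100)  # assumed rounding rule
    return ranked[:n_keep]


# Made-up records for illustration.
customers = [
    {"id": "A", "churn_rate": 0.9, "clv": 100},
    {"id": "B", "churn_rate": 0.9, "clv": 250},
    {"id": "C", "churn_rate": 0.4, "clv": 900},
    {"id": "D", "churn_rate": 0.1, "clv": 500},
]

# Analogue of get_top_churn_clv_customers: churn_rate first, then CLV.
by_churn_then_clv = top_percent(customers, 50, ["churn_rate", "clv"])
# Analogue of get_top_clv_customers: CLV only.
by_clv = top_percent(customers, 25, ["clv"])
```

Here customers A and B tie on `churn_rate`, so CLV decides their order and B ranks ahead of A.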

### PUT

The PUT methods below populate the database with the results of actions taken in response to the two GET methods above. The Raw Data folder contains two CSV files, `email_data.csv` and `notifications_data.csv`, with generated sample data whose structure matches the corresponding database tables.
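Loading a CSV file whose columns match a table can be sketched with the standard library alone. This is a hedged illustration rather than the endpoints' implementation: `load_csv_into_table`, the `FactEmail` column names, and the inline sample rows are hypothetical, and an in-memory string stands in for the real CSV file.

```python
import csv
import io
import sqlite3

# Made-up sample with assumed column names; the real files may differ.
SAMPLE_CSV = """customer_id,sent_date,subject
1,2023-11-01,We miss you
2,2023-11-02,Loyalty offer
"""


def load_csv_into_table(conn, table, csv_file):
    """Insert every CSV row into `table`, using the header row as column names."""
    reader = csv.DictReader(csv_file)
    columns = reader.fieldnames
    placeholders = ", ".join("?" for _ in columns)
    sql = f"INSERT INTO {table} ({', '.join(columns)}) VALUES ({placeholders})"
    rows = [tuple(row[c] for c in columns) for row in reader]
    with conn:  # one transaction for the whole file
        conn.executemany(sql, rows)
    return len(rows)


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE FactEmail (customer_id INTEGER, sent_date TEXT, subject TEXT)"
)
inserted = load_csv_into_table(conn, "FactEmail", io.StringIO(SAMPLE_CSV))
count = conn.execute("SELECT COUNT(*) FROM FactEmail").fetchone()[0]
ids = conn.execute(
    "SELECT customer_id FROM FactEmail ORDER BY customer_id"
).fetchall()
```

Taking the column list from the CSV header works only because the file structure matches the table, as stated above.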

#### 1. populate_fact_push_notification

#### 2. populate_fact_email

## License

This package is provided under the MIT License. Feel free to use and modify it in your projects.
