Write a Python function to calculate your metric(s), run it over all clusters in your data, and find the cuts of the data driving your metrics.

Project description

hot-spot-analysis

Hot Spot Analysis (HSA) is an analytic reporting framework that identifies the key drivers behind metric movements by analyzing data cuts across multiple features. It automatically processes all viable combinations of those features within the data and returns a structured output for deeper analysis, which helps analysts untangle complex interactions and trends and see why a metric shifted. Future updates aim to improve speed and functionality.

A working demo of HSA on a select set of datasets is available at https://hot-spot-analysis-demo.streamlit.app/

Installation

pip install hot-spot-analysis

Python Import

from hot_spot_analysis.hot_spot_analysis import HotSpotAnalyzer
HSA = HotSpotAnalyzer(...)

Quickstart

Short Theoretical Demonstration:

Suppose we have 3 columns [a, b, c] and want to cut our data using them. To know the impact of every interaction on our metric of interest, we have to group the data by each column on its own and by every combination of columns. The number of groupings grows quickly as columns are added: n columns give 2^n - 1 non-empty combinations.

Interacting 3 columns: [a, b, c] -> 7 valid data cuts

@ depth = 1: [a,b,c] <- 3 data cuts
@ depth = 2: [ab,ac,bc] <- 3 data cuts
@ depth = 3: [abc] <- 1 data cut
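
The counting above can be reproduced with the standard library. This is a minimal sketch for illustration only, not HSA's internal code; the helper name enumerate_cuts is hypothetical.

```python
from itertools import combinations


def enumerate_cuts(columns):
    """Return every non-empty combination of columns, keyed by depth.

    Hypothetical helper for illustration; not part of hot-spot-analysis.
    """
    return {
        depth: list(combinations(columns, depth))
        for depth in range(1, len(columns) + 1)
    }


cuts = enumerate_cuts(["a", "b", "c"])
for depth, combos in cuts.items():
    # Join each tuple so ("a", "b") prints as "ab", matching the notation above.
    print(f"@ depth = {depth}:", ["".join(c) for c in combos])

# Total number of column combinations: 2**3 - 1 for three columns.
total_cuts = sum(len(combos) for combos in cuts.values())
print("total cuts:", total_cuts)
```

Adding a fourth column would raise the total from 7 to 15, which is why enumerating the cuts by hand stops being practical quickly.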

A simple example of Hot Spot Analysis (HSA)

Example - Input Data

column1	column2	Value
A	X	10
A	Y	20
B	X	30
B	Y	40
C	X	50
C	Y	60
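
The run example refers to this table as example_data. One way to build it is sketched here, assuming HotSpotAnalyzer accepts a pandas DataFrame (the metric function's use of group['Value'].sum() suggests so, but this page does not state it).

```python
import pandas as pd

# The input table above as a DataFrame.
# Assumption: HSA's `data` argument is a pandas DataFrame.
example_data = pd.DataFrame(
    {
        "column1": ["A", "A", "B", "B", "C", "C"],
        "column2": ["X", "Y", "X", "Y", "X", "Y"],
        "Value": [10, 20, 30, 40, 50, 60],
    }
)

print(example_data)
```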

Example - Simple metric function

# Metric function
def metric_function(group):
    return {
        'sum_value': group['Value'].sum()
    }
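
Before running HSA, it can help to call the metric function by hand on a single slice to confirm it returns a dict of metric names to values. This sketch assumes each group is passed to the function as a pandas DataFrame; the slice below is built manually and does not involve the library.

```python
import pandas as pd


def metric_function(group):
    # Same metric as above: total of the Value column for one group.
    return {"sum_value": group["Value"].sum()}


# Rows of the example input where column1 == "A".
slice_a = pd.DataFrame(
    {
        "column1": ["A", "A"],
        "column2": ["X", "Y"],
        "Value": [10, 20],
    }
)

result = metric_function(slice_a)
print(int(result["sum_value"]))
```

The value matches the sum_value reported for {'column1': 'A'} in the example output.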

Example - Run HSA

from hot_spot_analysis.hot_spot_analysis import HotSpotAnalyzer

HSA = HotSpotAnalyzer(
    data=example_data,                  # See above
    target_cols=["column1", "column2"], 
    objective_function=metric_function, # See above
)

HSA.run_hsa()
hsa_data = HSA.export_hsa_output_df()
print(hsa_data.head(10))

Below is a simplified example of the HSA output

group	n_rows	sum_value
{'column1': 'A'}	2	30
{'column1': 'B'}	2	70
{'column1': 'C'}	2	110
{'column2': 'X'}	3	90
{'column2': 'Y'}	3	120
{'column1': 'A', 'column2': 'X'}	1	10
{'column1': 'A', 'column2': 'Y'}	1	20
{'column1': 'B', 'column2': 'X'}	1	30
{'column1': 'B', 'column2': 'Y'}	1	40
{'column1': 'C', 'column2': 'X'}	1	50
{'column1': 'C', 'column2': 'Y'}	1	60
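
To make clear where these rows come from, the same table can be rebuilt with plain pandas and itertools. This is only a sketch of the idea under stated assumptions, not the library's implementation: the function name sketch_all_cuts is hypothetical, and the real output may contain more columns or a different row order.

```python
from itertools import combinations

import pandas as pd

example_data = pd.DataFrame(
    {
        "column1": ["A", "A", "B", "B", "C", "C"],
        "column2": ["X", "Y", "X", "Y", "X", "Y"],
        "Value": [10, 20, 30, 40, 50, 60],
    }
)


def metric_function(group):
    return {"sum_value": group["Value"].sum()}


def sketch_all_cuts(data, target_cols, objective_function):
    """Apply objective_function to every group of every column combination.

    Hypothetical stand-in for illustration; not HSA's actual code.
    """
    rows = []
    for depth in range(1, len(target_cols) + 1):
        for cols in combinations(target_cols, depth):
            for key, group in data.groupby(list(cols)):
                # Depending on the pandas version, a one-column groupby
                # yields a scalar key or a 1-tuple; normalise to a tuple.
                key = key if isinstance(key, tuple) else (key,)
                row = {"group": dict(zip(cols, key)), "n_rows": len(group)}
                row.update(objective_function(group))
                rows.append(row)
    return pd.DataFrame(rows)


sketch = sketch_all_cuts(example_data, ["column1", "column2"], metric_function)
print(sketch)
```

With two target columns this yields 3 + 2 single-column groups and 6 two-column groups, for the 11 rows shown above.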

Project details

Release history

Version	Released
1.0.4.1 (this version)	Jul 14, 2024
1.0.4	Jul 7, 2024
1.0.4b0 (pre-release)	Jul 7, 2024
1.0.4a0 (pre-release)	May 28, 2024
1.0.2	May 19, 2024
0.1.4	Feb 28, 2023
