This package helps to find the overlap percentage of two probability distributions.

Project description

PDistMap

This package calculates the overlap percentage between two probability distributions, with applications in both academic and industrial settings. For instance, when a machine learning clustering algorithm is run several times, it may change the number or the names of the clusters between runs, making it hard for the end user to map clusters from one run to another. Comparing the distributions of the clusters gives a way to match them.

Example Use Cases:

  • Machine Learning Clustering: When a clustering algorithm is run multiple times, the cluster identifiers may change between iterations, making it difficult to track and compare clusters. This package helps map clusters across iterations by calculating the overlap percentage between the distributions of cluster assignments. For example, a data scientist who runs k-means several times can measure the overlap between clusters from different runs and match each cluster to its counterpart, even when the labels differ.

  • Anomaly Detection: The package can be used to compare the distribution of data points under normal and anomalous conditions, helping to identify anomalies and quantify their extent. For instance, in a network security application, the distribution of network traffic under normal conditions can be compared with the distribution during a suspected attack. A low overlap percentage indicates a large deviation from normal behavior and can point to a potential security breach.

  • Quality Control: In manufacturing and quality control processes, the package can be used to compare the distribution of measurements from different batches or production runs, ensuring consistency and identifying deviations. For example, a quality control engineer can compare the distribution of product dimensions from two different production runs to ensure that they meet the required specifications and identify any deviations that need to be addressed.

  • Market Research: The package can be applied to compare the distribution of survey responses or customer preferences across different demographic groups or time periods, providing insights into market trends and changes in consumer behavior. For instance, a market researcher can compare the distribution of customer satisfaction scores from two different regions to identify any significant differences and tailor marketing strategies accordingly.

  • Healthcare Analytics: In healthcare, the package can be used to compare the distribution of patient outcomes or treatment responses across different groups, aiding in the evaluation of treatment effectiveness and identifying potential disparities. For example, a healthcare analyst can compare the distribution of recovery times for patients receiving two different treatments to determine which treatment is more effective and identify any disparities in treatment outcomes.

Installation

pip install pdistmap

How to use it

Method 1

import numpy as np
from pdistmap.set import KDEIntersection

# Two samples to compare
A = np.array([25, 40, 70, 65, 69, 75, 80, 85])
B = np.array([25, 40, 70, 65, 69, 75, 80, 85, 81, 90])

# Overlap area between the two distributions
area = KDEIntersection(A, B).intersection_area()
print(area)  # Expected output: 0.8752770150023454

# Same call with plotting enabled
KDEIntersection(A, B).intersection_area(plot=True)

Sample Image
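To give an intuition for what an intersection area between two estimated densities means, here is a minimal sketch under stated assumptions. It is not the package's implementation. It assumes the overlap is the area under the pointwise minimum of two Gaussian kernel density estimates, with Scott's rule for the bandwidth and the trapezoidal rule for integration. The functions `gaussian_kde` and `overlap_area` and their parameters are hypothetical names written for this illustration, and the result may differ from the value printed by `KDEIntersection`.

```python
import math
import statistics


def gaussian_kde(sample):
    """Return a Gaussian KDE of `sample`, with Scott's rule for the bandwidth."""
    n = len(sample)
    bandwidth = statistics.stdev(sample) * n ** (-1 / 5)
    norm = n * bandwidth * math.sqrt(2 * math.pi)

    def density(x):
        return sum(
            math.exp(-0.5 * ((x - xi) / bandwidth) ** 2) for xi in sample
        ) / norm

    return density


def overlap_area(a, b, grid_points=2000, padding=3.0):
    """Approximate the area under min(f_a, f_b) with the trapezoidal rule."""
    f_a, f_b = gaussian_kde(a), gaussian_kde(b)
    # Extend the grid beyond the data so the KDE tails are included
    spread = max(statistics.stdev(a), statistics.stdev(b))
    lo = min(min(a), min(b)) - padding * spread
    hi = max(max(a), max(b)) + padding * spread
    step = (hi - lo) / (grid_points - 1)
    heights = [
        min(f_a(lo + i * step), f_b(lo + i * step)) for i in range(grid_points)
    ]
    return step * (sum(heights) - 0.5 * (heights[0] + heights[-1]))


A = [25, 40, 70, 65, 69, 75, 80, 85]
B = [25, 40, 70, 65, 69, 75, 80, 85, 81, 90]
print(overlap_area(A, B))
```

Under this definition, the area is close to 1 when the two samples follow the same distribution and close to 0 when their densities barely touch, which is what makes it usable as an overlap percentage.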

Download files

Source Distribution

pdistmap-0.5.0.tar.gz (12.2 kB)

Uploaded Source

Built Distribution

pdistmap-0.5.0-py3-none-any.whl (12.1 kB)

Uploaded Python 3

File details

Details for the file pdistmap-0.5.0.tar.gz.

File metadata

  • Download URL: pdistmap-0.5.0.tar.gz
  • Upload date:
  • Size: 12.2 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/1.8.2 CPython/3.9.12 Linux/6.8.0-51-generic

File hashes

Hashes for pdistmap-0.5.0.tar.gz
Algorithm Hash digest
SHA256 d8bbaa290f88a9c3638ced311a3fdd882d12c27057295dc87d92b04c07685612
MD5 8b2847c8a42e8c4f37358b1cca1b3962
BLAKE2b-256 7791b3f83e39fbd93550f08f0334f23424e718e4d9e550cd78a914121a954e28

File details

Details for the file pdistmap-0.5.0-py3-none-any.whl.

File metadata

  • Download URL: pdistmap-0.5.0-py3-none-any.whl
  • Upload date:
  • Size: 12.1 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/1.8.2 CPython/3.9.12 Linux/6.8.0-51-generic

File hashes

Hashes for pdistmap-0.5.0-py3-none-any.whl
Algorithm Hash digest
SHA256 88c6b7b6c4cc83ac188f68033c3ebf26e5951a1f8f2c9d74cb342d3a047ebab5
MD5 046b5796042a8b78d35b68d756d46f53
BLAKE2b-256 ff3c108aecec394bfc2f104d62f32e5ef8e1db15c124b1e5095cd870b626696a
