Graph theoretic scatterplot diagnostics

pyscagnostics

Python wrapper for computing graph theoretic scatterplot diagnostics.

Scagnostics describe various measures of interest for pairs of variables, based on their appearance on a scatterplot. They are a useful tool for discovering interesting or unusual scatterplots in a scatterplot matrix without having to look at every individual plot.

Wilkinson, L., Anand, A., and Grossman, R. (2006). High-Dimensional visual analytics: Interactive exploration guided by pairwise views of point distributions. IEEE Transactions on Visualization and Computer Graphics, November/December 2006 (Vol. 12, No. 6) pp. 1363-1372.

Installation

pip install pyscagnostics

Usage

from pyscagnostics import scagnostics

# Using NumPy arrays or lists
measures, _ = scagnostics(x, y)
print(measures)

# Using Pandas DataFrame
all_measures = scagnostics(df)
for measures, _ in all_measures:
    print(measures)
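
Since each `measures` result is a dict of scores, one way to surface the most interesting scatterplots is to sort the results by a single measure. The sketch below uses made-up scores in place of real output, so it runs without the package. `rank_by_measure` and the pair labels are hypothetical helpers, and the key names `Monotonic` and `Outlying` are taken from the Wilkinson et al. paper; the exact keys returned by pyscagnostics are an assumption, so check `pyscagnostics.measure_names` before relying on them.

```python
def rank_by_measure(results, measure):
    """Sort (label, measures) pairs by one measure, highest score first."""
    return sorted(results, key=lambda item: item[1][measure], reverse=True)


# Hypothetical scores standing in for real scagnostics output
example = [
    ("x vs y", {"Monotonic": 0.12, "Outlying": 0.30}),
    ("x vs z", {"Monotonic": 0.85, "Outlying": 0.05}),
    ("y vs z", {"Monotonic": 0.40, "Outlying": 0.10}),
]

ranked = rank_by_measure(example, "Monotonic")
for label, measures in ranked:
    print(label, measures["Monotonic"])
```

The labels are supplied by the caller here because the results shown above carry only the measures and the bins, not the column names.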

Documentation

def scagnostics(
    *args,
    bins: int=50,
    remove_outliers: bool=True
) -> Tuple[dict, np.ndarray]:
    """Scatterplot diagnostic (scagnostic) measures

    Scagnostics describe various measures of interest for pairs of variables,
    based on their appearance on a scatterplot.  They are a useful tool for
    discovering interesting or unusual scatterplots from a scatterplot matrix,
    without having to look at every individual plot.

    Example:
        `scagnostics` can take an x, y pair of iterables (e.g. lists or NumPy arrays):
        ```
            from pyscagnostics import scagnostics
            import numpy as np

            # Simulate data for example
            x = np.random.uniform(0, 1, 100)
            y = np.random.uniform(0, 1, 100)

            measures, bins = scagnostics(x, y)
        ```

        A Pandas DataFrame can also be passed as the single required argument. The
        output will be a generator of results:
        ```
            from pyscagnostics import scagnostics
            import numpy as np
            import pandas as pd

            # Simulate data for example
            x = np.random.uniform(0, 1, 100)
            y = np.random.uniform(0, 1, 100)
            z = np.random.uniform(0, 1, 100)
            df = pd.DataFrame({
                'x': x,
                'y': y,
                'z': z
            })

            results = scagnostics(df)
            for measures, bins in results:
                print(measures)
        ```

    Args:
        *args:
            x, y: Lists or numpy arrays
            df: A Pandas DataFrame
        bins: Max number of bins for the hexagonal grid axis
            The data are internally binned starting with a (bins x bins) hexagonal grid
            and re-binned with smaller bin sizes until fewer than 250 empty bins remain.
        remove_outliers: If True, will remove outliers before calculations

    Returns:
        (measures, bins)
            measures is a dict with scores for each of 9 scagnostic measures.
                See pyscagnostics.measure_names for a list of measures

            bins is a 3 x n numpy array of x-coordinates, y-coordinates, and 
                counts for the hex-bin grid. The x and y coordinates are re-scaled
                between 0 and 1000. This is returned for debugging and inspection purposes.

        If the input is a DataFrame, the output will be a generator yielding scagnostics
        for each pair of columns
    """
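
To get a feel for how many results a DataFrame produces, the following sketch enumerates unordered column pairs with `itertools.combinations`. It assumes the generator visits each unordered pair of columns once; the docstring does not state the iteration order, so the ordering shown here is an assumption, and `column_pairs` is a hypothetical helper rather than part of the package.

```python
from itertools import combinations


def column_pairs(columns):
    """Return every unordered pair of column names, in combinations() order."""
    return list(combinations(columns, 2))


pairs = column_pairs(["x", "y", "z"])
print(pairs)       # [('x', 'y'), ('x', 'z'), ('y', 'z')]
print(len(pairs))  # 3
```

Under this assumption, a DataFrame with n columns gives n * (n - 1) / 2 results, so 10 columns would give 45 scatterplots to score.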

Download files

Source Distributions

No source distribution files available for this release.

Built Distributions

pyscagnostics-0.1.0a2-cp38-cp38-win_amd64.whl (252.0 kB): CPython 3.8, Windows x86-64
pyscagnostics-0.1.0a2-cp38-cp38-manylinux2010_x86_64.whl (866.2 kB): CPython 3.8, manylinux: glibc 2.12+ x86-64
pyscagnostics-0.1.0a2-cp38-cp38-manylinux2010_i686.whl (826.0 kB): CPython 3.8, manylinux: glibc 2.12+ i686
pyscagnostics-0.1.0a2-cp38-cp38-manylinux1_x86_64.whl (866.2 kB): CPython 3.8
pyscagnostics-0.1.0a2-cp38-cp38-manylinux1_i686.whl (826.0 kB): CPython 3.8
pyscagnostics-0.1.0a2-cp38-cp38-macosx_10_14_x86_64.whl (257.0 kB): CPython 3.8, macOS 10.14+ x86-64
pyscagnostics-0.1.0a2-cp37-cp37m-win_amd64.whl (249.4 kB): CPython 3.7m, Windows x86-64
pyscagnostics-0.1.0a2-cp37-cp37m-manylinux2010_x86_64.whl (795.3 kB): CPython 3.7m, manylinux: glibc 2.12+ x86-64
pyscagnostics-0.1.0a2-cp37-cp37m-manylinux2010_i686.whl (763.3 kB): CPython 3.7m, manylinux: glibc 2.12+ i686
pyscagnostics-0.1.0a2-cp37-cp37m-manylinux1_x86_64.whl (795.3 kB): CPython 3.7m
pyscagnostics-0.1.0a2-cp37-cp37m-manylinux1_i686.whl (763.3 kB): CPython 3.7m
pyscagnostics-0.1.0a2-cp37-cp37m-macosx_10_14_x86_64.whl (256.9 kB): CPython 3.7m, macOS 10.14+ x86-64
pyscagnostics-0.1.0a2-cp36-cp36m-win_amd64.whl (249.2 kB): CPython 3.6m, Windows x86-64
pyscagnostics-0.1.0a2-cp36-cp36m-manylinux2010_x86_64.whl (795.6 kB): CPython 3.6m, manylinux: glibc 2.12+ x86-64
pyscagnostics-0.1.0a2-cp36-cp36m-manylinux2010_i686.whl (763.2 kB): CPython 3.6m, manylinux: glibc 2.12+ i686
pyscagnostics-0.1.0a2-cp36-cp36m-manylinux1_x86_64.whl (795.6 kB): CPython 3.6m
pyscagnostics-0.1.0a2-cp36-cp36m-manylinux1_i686.whl (763.2 kB): CPython 3.6m
pyscagnostics-0.1.0a2-cp36-cp36m-macosx_10_14_x86_64.whl (259.7 kB): CPython 3.6m, macOS 10.14+ x86-64
