Graph theoretic scatterplot diagnostics

pyscagnostics

Python wrapper for computing graph theoretic scatterplot diagnostics.

Scagnostics describe various measures of interest for pairs of variables, based on their appearance on a scatterplot. They are a useful tool for discovering interesting or unusual scatterplots in a scatterplot matrix without having to look at every individual plot.

Wilkinson, L., Anand, A., and Grossman, R. (2006). High-dimensional visual analytics: Interactive exploration guided by pairwise views of point distributions. IEEE Transactions on Visualization and Computer Graphics, Vol. 12, No. 6 (November/December 2006), pp. 1363-1372.

Installation

pip install pyscagnostics

Usage

from pyscagnostics import scagnostics

# Using NumPy arrays or lists
measures, _ = scagnostics(x, y)
print(measures)

# Using Pandas DataFrame
# Each item is (x, y, (measures, bins)), one per column pair
all_measures = scagnostics(df)
for x, y, (measures, _) in all_measures:
    print(x, y, measures)
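
When screening a scatterplot matrix, it is often handy to sort the column pairs by one measure. The sketch below assumes the documented DataFrame output shape, `(x, y, (measures, bins))`. The helper `rank_pairs` is hypothetical and not part of pyscagnostics, and the key `"Monotonic"` is an assumed measure name (check `pyscagnostics.measure_names` for the actual keys). Stand-in values replace a real `scagnostics(df)` call so the snippet runs on its own.

```python
from typing import Dict, Iterable, List, Tuple

# One item per column pair, in the documented shape: (x, y, (measures, bins)).
PairResult = Tuple[str, str, Tuple[Dict[str, float], object]]


def rank_pairs(
    results: Iterable[PairResult], measure: str
) -> List[Tuple[str, str, float]]:
    """Return (x, y, score) triples sorted from highest to lowest score."""
    scored = [(x, y, measures[measure]) for x, y, (measures, _bins) in results]
    return sorted(scored, key=lambda item: item[2], reverse=True)


# Stand-in values for illustration only; real input would be scagnostics(df).
# "Monotonic" is an assumed key, not confirmed by this page.
example_results = [
    ("x", "y", ({"Monotonic": 0.02}, None)),
    ("x", "z", ({"Monotonic": 0.91}, None)),
    ("y", "z", ({"Monotonic": 0.35}, None)),
]

ranking = rank_pairs(example_results, "Monotonic")
for x_name, y_name, score in ranking:
    print(f"{x_name} vs {y_name}: {score:.2f}")
```

Because the DataFrame form returns a generator, it can be passed to `rank_pairs` only once; wrap it in `list(...)` first if you want to rank by several measures.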

Documentation

def scagnostics(
    *args,
    bins: int=50,
    remove_outliers: bool=True
) -> Tuple[dict, np.ndarray]:
    """Scatterplot diagnostic (scagnostic) measures

    Scagnostics describe various measures of interest for pairs of variables,
    based on their appearance on a scatterplot.  They are a useful tool for
    discovering interesting or unusual scatterplots from a scatterplot matrix,
    without having to look at every individual plot.

    Example:
        `scagnostics` can take an x, y pair of iterables (e.g. lists or NumPy arrays):
        ```
            from pyscagnostics import scagnostics
            import numpy as np

            # Simulate data for example
            x = np.random.uniform(0, 1, 100)
            y = np.random.uniform(0, 1, 100)

            measures, bins = scagnostics(x, y)
        ```

        A Pandas DataFrame can also be passed as the single positional argument. The
        output will then be a generator of results, one per column pair:
        ```
            from pyscagnostics import scagnostics
            import numpy as np
            import pandas as pd

            # Simulate data for example
            x = np.random.uniform(0, 1, 100)
            y = np.random.uniform(0, 1, 100)
            z = np.random.uniform(0, 1, 100)
            df = pd.DataFrame({
                'x': x,
                'y': y,
                'z': z
            })

            results = scagnostics(df)
            for x, y, result in results:
                measures, bins = result
                print(measures)
        ```

    Args:
        *args:
            x, y: Lists or numpy arrays
            df: A Pandas DataFrame
        bins: Max number of bins for the hexagonal grid axis
            The data are internally binned starting with a (bins x bins) hexagonal grid
            and re-binned with smaller bin sizes until fewer than 250 empty bins remain.
        remove_outliers: If True, will remove outliers before calculations

    Returns:
        (measures, bins)
            measures is a dict with scores for each of 9 scagnostic measures.
                See pyscagnostics.measure_names for a list of measures

            bins is a 3 x n numpy array of x-coordinates, y-coordinates, and
                counts for the hex-bin grid. The x and y coordinates are re-scaled
                between 0 and 1000. This is returned for debugging and inspection purposes.

        If the input is a DataFrame, the output will be a generator yielding tuples of
        scagnostic results, one for each column pair:
            (x, y, (measures, bins))
    """
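
The returned `bins` array is described above as 3 x n (x-coordinates, y-coordinates, counts), with coordinates rescaled to the range 0 to 1000. A minimal sketch of how it might be inspected follows. The helpers `summarize_bins` and `to_unit_square` are hypothetical, and the stand-in list of three rows mimics the documented layout rather than coming from a real `scagnostics` call.

```python
from typing import List, Sequence, Tuple


def summarize_bins(bins: Sequence[Sequence[float]]) -> Tuple[int, float]:
    """Return (number of non-empty bins, total count) for a 3 x n bins array."""
    _xs, _ys, counts = bins  # rows: x-coordinates, y-coordinates, counts
    non_empty = sum(1 for c in counts if c > 0)
    return non_empty, sum(counts)


def to_unit_square(
    bins: Sequence[Sequence[float]],
) -> List[Tuple[float, float, float]]:
    """Rescale bin coordinates from the documented 0-1000 range to 0-1."""
    xs, ys, counts = bins
    return [(x / 1000.0, y / 1000.0, c) for x, y, c in zip(xs, ys, counts)]


# Stand-in values for illustration only; a real array would come from
# `measures, bins = scagnostics(x, y)`.
example_bins = [
    [0.0, 500.0, 1000.0],    # x-coordinates
    [250.0, 750.0, 1000.0],  # y-coordinates
    [3, 5, 2],               # counts per bin
]

n_bins, n_points = summarize_bins(example_bins)
points = to_unit_square(example_bins)
print(n_bins, n_points)
print(points)
```

Unpacking the three rows works the same way for a 3 x n NumPy array, so the helpers should accept the real return value unchanged.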

Download files

Files for pyscagnostics, version 0.1.0a4

| Filename | Size | File type | Python version |
| --- | --- | --- | --- |
| pyscagnostics-0.1.0a4-cp36-cp36m-macosx_10_14_x86_64.whl | 259.5 kB | Wheel | cp36 |
| pyscagnostics-0.1.0a4-cp36-cp36m-manylinux1_i686.whl | 762.5 kB | Wheel | cp36 |
| pyscagnostics-0.1.0a4-cp36-cp36m-manylinux1_x86_64.whl | 794.6 kB | Wheel | cp36 |
| pyscagnostics-0.1.0a4-cp36-cp36m-manylinux2010_i686.whl | 762.5 kB | Wheel | cp36 |
| pyscagnostics-0.1.0a4-cp36-cp36m-manylinux2010_x86_64.whl | 794.6 kB | Wheel | cp36 |
| pyscagnostics-0.1.0a4-cp36-cp36m-win_amd64.whl | 249.0 kB | Wheel | cp36 |
| pyscagnostics-0.1.0a4-cp37-cp37m-macosx_10_14_x86_64.whl | 256.8 kB | Wheel | cp37 |
| pyscagnostics-0.1.0a4-cp37-cp37m-manylinux1_i686.whl | 762.3 kB | Wheel | cp37 |
| pyscagnostics-0.1.0a4-cp37-cp37m-manylinux1_x86_64.whl | 794.5 kB | Wheel | cp37 |
| pyscagnostics-0.1.0a4-cp37-cp37m-manylinux2010_i686.whl | 762.3 kB | Wheel | cp37 |
| pyscagnostics-0.1.0a4-cp37-cp37m-manylinux2010_x86_64.whl | 794.5 kB | Wheel | cp37 |
| pyscagnostics-0.1.0a4-cp37-cp37m-win_amd64.whl | 249.2 kB | Wheel | cp37 |
| pyscagnostics-0.1.0a4-cp38-cp38-macosx_10_14_x86_64.whl | 256.9 kB | Wheel | cp38 |
| pyscagnostics-0.1.0a4-cp38-cp38-manylinux1_i686.whl | 824.4 kB | Wheel | cp38 |
| pyscagnostics-0.1.0a4-cp38-cp38-manylinux1_x86_64.whl | 863.9 kB | Wheel | cp38 |
| pyscagnostics-0.1.0a4-cp38-cp38-manylinux2010_i686.whl | 824.4 kB | Wheel | cp38 |
| pyscagnostics-0.1.0a4-cp38-cp38-manylinux2010_x86_64.whl | 863.9 kB | Wheel | cp38 |
| pyscagnostics-0.1.0a4-cp38-cp38-win_amd64.whl | 251.8 kB | Wheel | cp38 |
