Dimensionality reduction through Simplified Topological Abstraction of Data

pySTAD - Python implementation of Simplified Topological Abstraction of Data

Installation

pip install stad

Usage

The input to stad is a normalised distance matrix (i.e. with all values between 0 and 1). Optionally, you can also provide an array holding one value per datapoint, which is then used as the lens.
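As a rough illustration of what "normalised" means here, the sketch below builds a pairwise Euclidean distance matrix and divides it by its largest entry so that all values fall between 0 and 1. The function name normalised_distance_matrix is hypothetical and not part of stad, and scaling by the maximum is only one possible normalisation; whether stad's own helper works the same way is not stated here. It uses only the standard library (Python 3.8+ for math.dist) so that the arithmetic stays explicit.

```python
import math


def normalised_distance_matrix(points):
    """Pairwise Euclidean distances, scaled so the largest distance is 1.

    Hypothetical helper for illustration only; not part of the stad API.
    """
    n = len(points)
    dist = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d = math.dist(points[i], points[j])
            dist[i][j] = dist[j][i] = d  # the matrix is symmetric

    largest = max(max(row) for row in dist) if n else 0.0
    if largest == 0:
        return dist  # zero or identical points: nothing to scale
    return [[d / largest for d in row] for row in dist]


# A few (x, y) pairs taken from the five circles sample rows
points = [(377, 566), (362, 589), (350, 607), (104, 977)]
matrix = normalised_distance_matrix(points)
```

The result is a square, symmetric list of lists with zeros on the diagonal, which can be converted to an array if the consuming function expects one.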

For example, consider the five circles dataset used in the example script below. Without a lens, a STAD analysis reveals a circle with four spikes; with a lens, each of these spikes also becomes a circle (as in the picture).

The first rows of this dataset look like this:

x,y,hue
377,566,#1F988B
362,589,#21A585
350,607,#29AF7F
104,977,#20928C
124,978,#26818E
118,956,#1F9E89
...

Here's a complete script to create this graph:

import stad
import pandas as pd

## Load the data
url = 'https://gist.githubusercontent.com/jandot/a84c0505cdc8008a6e5ae5032532a39f/raw/d834527117fd204d33486998d10290251354d013/five_circles.csv'
data = pd.read_csv(url, header=0)

## Extract the values used for the distance calculation, the lens (the hue
## component of each colour), and the optional features
values = data[['x', 'y']].values.tolist()
lens = data['hue'].map(lambda x: stad.hex_to_hsv(x)[0]).values
xs = data['x'].values.tolist()
ys = data['y'].values.tolist()
hues = data['hue'].values.tolist()

## Create the distance matrix in the high-dimensional space. This can use
## cosine distance, Euclidean distance, or any other metric.
highD_dist_matrix = stad.calculate_highD_dist_matrix(values)

## Run STAD and show the result
g = stad.run_stad(highD_dist_matrix, lens=lens, features={'x':xs, 'y':ys, 'hue': hues})
stad.draw_stad(g)
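The lens in the script above is the first component (the hue) of each point's colour after converting the hex string to HSV. The sketch below shows what such a conversion can look like using the standard-library colorsys module. The function hex_to_hue is a hypothetical stand-in and not stad's hex_to_hsv; in particular, the scale on which stad returns the hue is an assumption not confirmed here (colorsys returns it in the range 0 to 1).

```python
import colorsys


def hex_to_hue(hex_colour):
    """Return the HSV hue (0..1) of a colour given as '#RRGGBB'.

    Hypothetical helper for illustration; stad.hex_to_hsv may differ.
    """
    h = hex_colour.lstrip('#')
    # Split into red, green and blue bytes and rescale each to 0..1
    r, g, b = (int(h[i:i + 2], 16) / 255 for i in (0, 2, 4))
    hue, _saturation, _value = colorsys.rgb_to_hsv(r, g, b)
    return hue


# Hue values for a few colours from the five circles sample rows
lens_values = [hex_to_hue(c) for c in ['#1F988B', '#21A585', '#29AF7F']]
```

Because hue is a single number per datapoint, it fits the "one value per datapoint" form that the lens expects.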
