Dimensionality reduction through Simplified Topological Abstraction of Data
pySTAD - Python implementation of Simplified Topological Abstraction of Data
Installation
pip install stad
Usage
The input to stad is a normalised distance matrix, i.e. a square matrix of pairwise distances with all values between 0 and 1. Optionally, you can also provide an array holding one value per datapoint, which is used as the lens.
As an example, consider the five circles dataset used in the example script below. Without a lens, a STAD analysis reveals a circle with four spikes; with a lens, each of these spikes also becomes a circle (as in the picture).
The dataset looks like this:
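To make the normalisation requirement concrete, here is a minimal sketch that builds such a matrix by hand. It assumes Euclidean distance and scaling by the largest pairwise distance; `normalised_distance_matrix` is a hypothetical helper and not part of stad, and stad's own helper may normalise differently.

```python
import math


def normalised_distance_matrix(points):
    """Pairwise Euclidean distances, scaled so the largest distance is 1.

    Hypothetical helper for illustration only; not part of the stad package.
    """
    dist = [[math.dist(p, q) for q in points] for p in points]
    max_dist = max(max(row) for row in dist)
    if max_dist == 0:
        # All points coincide: every distance is already 0.
        return dist
    return [[d / max_dist for d in row] for row in dist]


# A few (x, y) rows taken from the five circles dataset.
points = [(377, 566), (362, 589), (350, 607), (104, 977)]
dist_matrix = normalised_distance_matrix(points)
```

The result is a plain list of lists; if an array is required, it can be wrapped with `numpy.array(dist_matrix)`.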
x,y,hue
377,566,#1F988B
362,589,#21A585
350,607,#29AF7F
104,977,#20928C
124,978,#26818E
118,956,#1F9E89
...
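The `hue` column holds hex colour strings, while a lens needs one number per datapoint. As a rough sketch of how a hex colour can be turned into a numeric hue, the snippet below uses the standard library's `colorsys` module. `hex_to_hue` is a hypothetical helper; it is assumed, not confirmed, that `stad.hex_to_hsv` performs a comparable conversion.

```python
import colorsys


def hex_to_hue(hex_colour):
    """Convert a '#RRGGBB' string to its HSV hue in the range [0, 1).

    Hypothetical helper for illustration only; not part of the stad package.
    """
    digits = hex_colour.lstrip('#')
    # Split into red, green, blue and scale each channel to [0, 1].
    r, g, b = (int(digits[i:i + 2], 16) / 255 for i in (0, 2, 4))
    hue, _saturation, _value = colorsys.rgb_to_hsv(r, g, b)
    return hue


# Hues for the first three rows shown above.
hues = [hex_to_hue(c) for c in ['#1F988B', '#21A585', '#29AF7F']]
```

Hue is cyclic (values near 0 and near 1 are both red), which is worth keeping in mind when using it as a lens.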
Here's a complete script to create this graph:
import stad
import pandas as pd
## Load the data
url = 'https://gist.githubusercontent.com/jandot/a84c0505cdc8008a6e5ae5032532a39f/raw/d834527117fd204d33486998d10290251354d013/five_circles.csv'
data = pd.read_csv(url, header=0)
## Extract the values we want to use in our distance, the lens, and optional features
values = data[['x','y']].values.tolist()
lens = data['hue'].map(lambda x: stad.hex_to_hsv(x)[0]).values
xs = data['x'].values.tolist()
ys = data['y'].values.tolist()
hues = data['hue'].values.tolist()
## Create the distance matrix in the high-dimensional space. It can be based on
## cosine distance, Euclidean distance, or any other metric.
highD_dist_matrix = stad.calculate_highD_dist_matrix(values)
## Run STAD and show the result
g = stad.run_stad(highD_dist_matrix, lens=lens, features={'x':xs, 'y':ys, 'hue': hues})
stad.draw_stad(g)
File details
Details for the file stad-2.0.1.tar.gz.
File metadata
- Download URL: stad-2.0.1.tar.gz
- Upload date:
- Size: 7.2 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/3.1.1 pkginfo/1.4.2 requests/2.21.0 setuptools/40.6.3 requests-toolbelt/0.9.1 tqdm/4.28.1 CPython/3.7.1
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 5a9aa9ef87a9802d211e6c799ce7effac5b34cb74c0d397dde9374b8a36aa70a |
| MD5 | c7d844a2ed2f387c50bfd13179a6fd7a |
| BLAKE2b-256 | 581fc4fbeaec08cd13442e26c2bdc53547864b3d463abadc4aa11bd4944945de |
File details
Details for the file stad-2.0.1-py3-none-any.whl.
File metadata
- Download URL: stad-2.0.1-py3-none-any.whl
- Upload date:
- Size: 8.3 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/3.1.1 pkginfo/1.4.2 requests/2.21.0 setuptools/40.6.3 requests-toolbelt/0.9.1 tqdm/4.28.1 CPython/3.7.1
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | fcb996784aa5f685fdaf85bfe8149c45b80e768eac98dbdecc1713c92fb4db90 |
| MD5 | a8712dc455132bcdada7bcd4fc5d42ec |
| BLAKE2b-256 | f3815eb74c96413b3547a2100fd94fb41afc3775bc619a8ceddafa3171adc63d |