FeatureCloud Visualization

FeatureCloud Cluster Visualization app

This is an interactive cluster visualization app implemented in Dash and Plotly.

App usage

This app is intended to be used in the FeatureCloud environment. It requires input data in order to generate the interactive visualization interface. The data must be placed in the location and in the format specified by this documentation. The app has a tabbed interface consisting of:

Confounders
Distances
Clustering Quality
Scree plot
Volcano plot
Help

Confounders tab

Main features:

Cluster or client id field based clustering display
K number selector
Cluster selector
X and Y axes selector
Use pie or bar chart selector for discrete data type visualization
Confounding factors filter
Scatter plot with confidence ellipses
Linear or logarithmic scale
Point/Lasso/Box selection
Export diagrams to PNG
Confounding factors diagrams
Visualize and download selected points

Distances tab

Main features:

K number selector
Cluster selector
Confounding factors filter
Clustergram

Clustering Quality tab

Main features:

K number selector
Silhouette plot

Scree plot tab

Main features:

Display the eigenvalue of each component

Volcano plot tab

Main features:

Set effect size thresholds (vertical lines)
Set genome-wide line threshold (horizontal line)

Help

This tab displays this documentation.

Input data requirements

Expected folder structure for visual representation

data
├───results
│   ├───K2
│   │   ├───clustering.csv
│   │   └───silhouette.csv
│   ├───K3
│   │   ├───clustering.csv
│   │   └───silhouette.csv
│   ├───...
│   └───K<n>
│       ├───clustering.csv
│       └───silhouette.csv
├───confoundingData.csv
├───confoundingData.meta
├───localData.csv
├───distanceMatrix.csv
├───varianceExplained.csv
└───volcano_data.csv

Download

For a better understanding, an example data set can be downloaded by clicking here.

Tip for running

When running the app in a workflow, one can upload a zip file containing the config file and the data files. The archive is automatically unzipped and copied to the input directory of the app. To test this, upload the example data set mentioned above. To finish the app from within a workflow, click the Finished button in the top right corner. This allows the next app in the workflow, if any, to start, or the workflow to stop.

Notes:

The localData.csv file is mandatory. All other files are optional.
If a K folder exists, all files under it are mandatory.
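
The two notes above can be checked before starting the app. The following is a minimal sketch, not part of the app itself; the function name check_layout and the message wording are hypothetical, and the folder name results is taken from the folder structure shown above.

```python
from pathlib import Path

MANDATORY_FILE = "localData.csv"
K_FOLDER_FILES = ("clustering.csv", "silhouette.csv")


def check_layout(data_dir):
    """Return a list of problems found in the data directory (empty if none)."""
    data_dir = Path(data_dir)
    problems = []

    # localData.csv is the only file that must always be present.
    if not (data_dir / MANDATORY_FILE).is_file():
        problems.append(f"missing mandatory file: {MANDATORY_FILE}")

    # The results folder is optional, but every K folder in it must be complete.
    results = data_dir / "results"
    if results.is_dir():
        for k_dir in sorted(p for p in results.iterdir() if p.is_dir()):
            for name in K_FOLDER_FILES:
                if not (k_dir / name).is_file():
                    problems.append(f"missing {name} in {k_dir.name}")
    return problems
```

Calling check_layout("data") on a complete data set returns an empty list.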

Delimiter

The default delimiter is the ";" character. It can be overridden in the config.yml file.
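
To illustrate what the delimiter setting means for the data files, here is a small sketch that reads delimited text with the Python standard library. It is only an illustration of the file format; the helper name read_rows is hypothetical, and how the app itself reads the files is not shown here.

```python
import csv
import io

DEFAULT_DELIMITER = ";"


def read_rows(text, delimiter=DEFAULT_DELIMITER):
    """Parse delimited text into a list of dicts keyed by the header row."""
    return list(csv.DictReader(io.StringIO(text), delimiter=delimiter))


# With the default delimiter, a clustering file parses like this:
rows = read_rows("id;cluster\n1;1\n2;1\n")
```

If the delimiter is changed in config.yml, the same value has to be used in every data file.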

Expected file structure

confoundingData.csv

This file contains all confounding factors related to the local data. The first column is the id (mandatory), followed by a maximum of 5 confounder columns. The confounder column names are arbitrary, but must not match the reserved column names: id, cluster, client_id.

Example

id;age;sex;race;height;sugar-level
1;38;F;Caucasian;159;low
2;17;F;Asian;175;low
3;40;F;African-American;162;medium
4;32;F;Indian;183;high
5;18;F;Indian;193;low
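
The header rules for this file can be expressed as a short check. This is a sketch under the rules stated above (id first, no reserved names, at most 5 confounders used); the function name confounder_columns is hypothetical, and keeping only the first 5 columns mirrors the behaviour described under Limitations.

```python
RESERVED = {"id", "cluster", "client_id"}
MAX_CONFOUNDERS = 5


def confounder_columns(header_line, delimiter=";"):
    """Validate a confoundingData.csv header and return the confounders used."""
    columns = header_line.strip().split(delimiter)
    if columns[0] != "id":
        raise ValueError("the first column must be 'id'")

    confounders = columns[1:]
    clashes = RESERVED.intersection(confounders)
    if clashes:
        raise ValueError(f"reserved column names used: {sorted(clashes)}")

    # Only the first 5 confounders are displayed.
    return confounders[:MAX_CONFOUNDERS]
```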

confoundingData.meta

This file contains meta information about the confounding factors.

Supported data and value types:

Columns:

name: the name of the confounding factor
data_type:
- continuous: arbitrary values
- discrete: values from a predefined value set
- ordinal: values from a predefined, ordered value set
value_type:
- integer
- string
- comma-separated enumeration of the values, in order (for ordinal data)

Example

name;data_type;value_type
age;continuous;integer
sex;discrete;string
race;discrete;string
height;continuous;integer
sugar-level;ordinal;low,medium,high
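
As an illustration of how the meta file can be interpreted, the sketch below parses the three columns and turns the enumeration of an ordinal factor into an ordered list. The function names parse_meta and ordinal_rank and the levels key are hypothetical; this is not the app's own parser.

```python
DATA_TYPES = {"continuous", "discrete", "ordinal"}


def parse_meta(text, delimiter=";"):
    """Parse confoundingData.meta content into a dict keyed by factor name."""
    lines = [line for line in text.strip().splitlines() if line.strip()]
    if lines[0].split(delimiter) != ["name", "data_type", "value_type"]:
        raise ValueError("unexpected header in meta file")

    meta = {}
    for line in lines[1:]:
        name, data_type, value_type = line.split(delimiter, 2)
        if data_type not in DATA_TYPES:
            raise ValueError(f"unknown data_type for {name}: {data_type}")
        entry = {"data_type": data_type, "value_type": value_type}
        if data_type == "ordinal":
            # The enumeration lists the allowed values from lowest to highest.
            entry["levels"] = value_type.split(",")
        meta[name] = entry
    return meta


def ordinal_rank(meta, name, value):
    """Return the position of a value within its ordinal enumeration."""
    return meta[name]["levels"].index(value)
```

Because the enumeration is comma-separated, this sketch assumes the file delimiter is not a comma.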

localData.csv

This file contains the base values. Columns:

id: sample id (mandatory)
client_id: optional field, the app supports display of clustering on this field as well
data columns: at least 2 data columns need to be present. More than 2 data columns are supported. The column names are arbitrary and must not match reserved column names: id, cluster, client_id

Example

id;client_id;x;y;z
1;1;-0.115257648318211;0.289555823437292;0.333954194475931
2;1;-0.226069897739012;0.293898393621215;0.130668954544708
3;1;0.0606059327164007;0.0297344961039227;0.112959671444335
4;1;0.0398616396572761;-0.37563056412847;-0.35560909629883
5;1;-0.21084222999711;0.592948181336414;-0.368794747648271
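
The column rules for localData.csv can be checked from the header alone. The following sketch assumes the rules listed above; the function name data_columns is hypothetical.

```python
def data_columns(header_line, delimiter=";"):
    """Validate a localData.csv header and return the names of the data columns."""
    columns = header_line.strip().split(delimiter)
    if "id" not in columns:
        raise ValueError("the 'id' column is mandatory")

    # id and client_id are bookkeeping columns; everything else is data.
    data = [c for c in columns if c not in ("id", "client_id")]
    if "cluster" in data:
        raise ValueError("'cluster' is a reserved column name")
    if len(data) < 2:
        raise ValueError("at least 2 data columns are required")
    return data
```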

distanceMatrix.csv

This file contains the distances between samples. It is an n x n matrix, where n is the number of samples.

Example

1;2;3;4;5
1;0;0.53851648071345;0.509901951359278;0.648074069840786;0.141421356237309
2;0.53851648071345;0;0.3;0.331662479035541;0.608276253029822
3;0.509901951359278;0.3;0;0.244948974278318;0.509901951359278
4;0.648074069840786;0.331662479035541;0.244948974278318;0;0.648074069840786
5;0.141421356237309;0.608276253029822;0.509901951359278;0.648074069840786;0
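
A quick sanity check for this file is sketched below. It assumes the layout of the example (a first line with the column ids, then one line per sample starting with the row id). The requirement of a zero diagonal and a symmetric matrix is an assumption based on the example, not a rule stated by the app; the function names are hypothetical.

```python
import math


def parse_distance_matrix(text, delimiter=";"):
    """Split the file content into row ids and a list of float rows."""
    lines = text.strip().splitlines()
    row_ids, matrix = [], []
    for line in lines[1:]:  # the first line holds the column ids
        fields = line.split(delimiter)
        row_ids.append(fields[0])
        matrix.append([float(value) for value in fields[1:]])
    return row_ids, matrix


def is_valid_distance_matrix(matrix, tol=1e-9):
    """Check that the matrix is n x n, symmetric, and zero on the diagonal."""
    n = len(matrix)
    if any(len(row) != n for row in matrix):
        return False
    for i in range(n):
        if abs(matrix[i][i]) > tol:
            return False
        for j in range(i + 1, n):
            if not math.isclose(matrix[i][j], matrix[j][i], abs_tol=tol):
                return False
    return True
```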

varianceExplained.csv

This file contains the eigenvalues for components. Columns:

component: mandatory field, it contains the name of the component
eigenvalue: mandatory field, it contains the eigenvalue of the component

Example

component;eigenvalue
x;0.729624454
y;0.408507618
z;0.228507618
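
The scree plot shows the eigenvalues as given. A common companion quantity is the share of the total that each component contributes; the sketch below computes it, assuming the file lists all components. It is an illustration only and the function names are hypothetical.

```python
def explained_variance_ratio(eigenvalues):
    """Return each eigenvalue divided by the sum of all eigenvalues."""
    total = sum(eigenvalues)
    if total <= 0:
        raise ValueError("the eigenvalues must sum to a positive number")
    return [value / total for value in eigenvalues]


def cumulative(values):
    """Return the running sum of a list of numbers."""
    running, result = 0.0, []
    for value in values:
        running += value
        result.append(running)
    return result


# Eigenvalues from the example file above.
ratios = explained_variance_ratio([0.729624454, 0.408507618, 0.228507618])
```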

clustering.csv

This file contains the cluster distribution of the samples. Columns:

id: mandatory, sample id
cluster: mandatory, cluster id

Example

id;cluster
1;1
2;1
3;1
4;1
5;1

silhouette.csv

This file contains the data used to display the silhouette plot of the clusters. Columns:

index column, mandatory
y: mandatory, contains the value to be plotted
cluster: mandatory, contains the cluster id

Example

x;y;cluster
1;0.369499266613275;1
2;0.783307729521766;1
3;0.0627545099705458;1
4;0.205028521828353;1
5;0.915254552382976;1
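
To compare clustering quality between K values outside the app, the silhouette values can be averaged per cluster. This is a sketch that assumes rows already parsed into dicts with the y and cluster columns described above; the function name is hypothetical and the app may compute its plot differently.

```python
from collections import defaultdict


def mean_silhouette_by_cluster(rows):
    """Average the y values of silhouette.csv rows for each cluster id."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for row in rows:
        sums[row["cluster"]] += float(row["y"])
        counts[row["cluster"]] += 1
    return {cluster: sums[cluster] / counts[cluster] for cluster in sums}
```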

volcano_data.csv

This file contains the data used to display the volcano plot. The columns are the default columns used by the Dash Bio volcano plot component.

Example

CHR;BP;P;SNP;ZSCORE;EFFECTSIZE;GENE;DISTANCE
1;937641;0.335343792801723;rs9697358;0.9634;-0.0946;ISG15;1068
1;1136887;0.245857131900266;rs34945898;1.1605;-0.0947;TNFRSF4;0
1;2116240;0.823285880265757;rs12034613;0.2233;-0.0741;FP7162;0
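
A volcano plot places each row at its effect size on the horizontal axis and at -log10(P) on the vertical axis. The sketch below shows how the two kinds of thresholds from the Volcano plot tab separate the points. The threshold values and the function name are hypothetical and chosen only for illustration.

```python
import math


def classify_point(p_value, effect_size,
                   effect_thresholds=(-1.0, 1.0), genomewide_line=4.0):
    """Return (-log10(p), highlighted) for one row of volcano_data.csv.

    A point is highlighted when it lies above the horizontal genome-wide
    line and outside the two vertical effect size thresholds.
    """
    neg_log_p = -math.log10(p_value)
    lower, upper = effect_thresholds
    above_line = neg_log_p >= genomewide_line
    large_effect = effect_size <= lower or effect_size >= upper
    return neg_log_p, above_line and large_effect
```

With these illustrative thresholds, none of the three example rows above would be highlighted, since their P values are large and their effect sizes are close to zero.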

General requirements for input data

the number of samples has to be the same in all files
the sample ids must be consistent across all files
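
Both requirements can be verified by comparing the id columns of the files. The sketch below uses localData.csv as the reference because it is the only mandatory file; the function name and the shape of the report are hypothetical.

```python
def compare_sample_ids(ids_by_file, reference="localData.csv"):
    """Report files whose sample ids differ from those of the reference file.

    ids_by_file maps a file name to the list of ids read from that file.
    Duplicate ids within one file are not detected by this sketch.
    """
    expected = set(ids_by_file[reference])
    report = {}
    for name, ids in ids_by_file.items():
        found = set(ids)
        if found != expected:
            report[name] = {
                "missing": sorted(expected - found),
                "unexpected": sorted(found - expected),
            }
    return report
```

An empty report means that every file contains exactly the ids found in localData.csv.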

Config file support

The app supports setting all data file and directory paths from a config file. The config.yml file should be placed in the default data directory (/mnt/input/data). Example:

fc-cluster-visualization-app:
  delimiter: ';'
  data-dir: 'data/exampleData'
  local-data-path: 'data/exampleData/localData.csv'
  distance-matrix-path: 'data/exampleData/distanceMatrix.csv'
  confounding-meta-path: 'data/exampleData/confoundingData.meta'
  confounding-data-path: 'data/exampleData/confoundingData.csv'
  variance-explained-path: 'data/exampleData/varianceExplained.csv'
  k-values-clustering-result-dir: 'data/exampleData/results'
  k-values-clustering-file-name: 'clustering.csv'
  k-values-silhouette-file-name: 'silhouette.csv'
  volcano-data-path: 'exampleData/volcano_data.csv'
  # all files downloaded from the browser will end up here too
  download-dir: 'data/exampleData/downloads'

If the config file is not present, the app will search for data in the default folder (/mnt/input/data). Any key can be omitted from the config file; in that case the app will search in the default data directory. Keys should not be left with blank values.
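
The fallback rules can be sketched as follows, operating on the already parsed content of the fc-cluster-visualization-app section. The default file names are an assumption taken from the expected folder structure, and the function name resolve_paths is hypothetical; the app's actual lookup may differ.

```python
from pathlib import PurePosixPath

DEFAULT_DATA_DIR = "/mnt/input/data"

# Assumed default names, taken from the expected folder structure.
DEFAULT_NAMES = {
    "local-data-path": "localData.csv",
    "distance-matrix-path": "distanceMatrix.csv",
    "confounding-meta-path": "confoundingData.meta",
    "confounding-data-path": "confoundingData.csv",
    "variance-explained-path": "varianceExplained.csv",
    "volcano-data-path": "volcano_data.csv",
    "k-values-clustering-result-dir": "results",
}


def resolve_paths(app_config):
    """Fill in every omitted key with a path in the default data directory."""
    resolved = {}
    for key, default_name in DEFAULT_NAMES.items():
        if key in app_config:
            value = app_config[key]
            # Keys that are present must not be left blank.
            if value is None or not str(value).strip():
                raise ValueError(f"config key '{key}' must not be blank")
            resolved[key] = str(value)
        else:
            resolved[key] = str(PurePosixPath(DEFAULT_DATA_DIR) / default_name)
    return resolved
```

Passing an empty dict corresponds to running the app without a config file.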

Limitations

the app supports displaying at most 5 confounding factors simultaneously
if more than 5 confounding factors are present in the confoundingData.meta file, the app will display the first 5

Workflow

When the app runs in a FeatureCloud workflow, a Finished button is displayed in the upper right corner. Clicking the button terminates the application, and the controller shuts down the Docker container. The content of the input folder is also copied to the output folder.

Screenshots

Confounders tab

[Screenshot: Confounding factors filter with scatter plot]
[Screenshot: Scatter plot with confounding factors diagrams]
[Screenshot: View selected data from scatter plot]

Distances tab

[Screenshot: Clustergram]

Clustering Quality tab

[Screenshot: Silhouette diagram]

Scree plot

[Screenshot: Scree plot tab]

Volcano plot

[Screenshot: Volcano plot tab]

Release history

0.0.0.6 (Sep 29, 2022)
0.0.0.5 (Sep 29, 2022)
0.0.0.4 (Sep 28, 2022), this version
0.0.0.3 (Sep 27, 2022)
0.0.0.2 (Sep 16, 2022)
0.0.0.1 (Sep 6, 2022)
