Noise contrastive data visualization

ncvis

NCVis is an efficient solution for data visualization. It uses HNSW (Hierarchical Navigable Small World graphs) for fast nearest-neighbor graph construction and a parallel approach for building the graph embedding.

Installation

Conda [recommended]

You do not need to set up the environment when using conda: all dependencies are installed automatically.

$ conda install -c alartum ncvis 

Pip

Important: be sure to have a compiler with OpenMP support. GCC supports it by default, which is not the case with clang. You may need to install the llvm-openmp library beforehand.

Install the numpy and cython packages (compile-time dependencies):

$ pip install numpy cython

Install the ncvis package:

$ pip install ncvis

Using

import ncvis

vis = ncvis.NCVis()
Y = vis.fit(X)

A more detailed example can be found here.

Experiments

Datasets can be downloaded with the download.sh script:

$ bash examples/data/download.sh <dataset>

Replace <dataset> with the corresponding entry from the <dataset> column of the table below. You can also download all of them at once:

$ bash examples/data/download.sh

The datasets can then be accessed through the interfaces in the data Python module. Make sure that the required packages are installed:

$ pip install -r examples/requirements-pip.txt

or

$ conda install --file examples/requirements-conda.txt

| Dataset            | <dataset> | Dataset Class |
|--------------------|-----------|---------------|
| MNIST              | mnist     | MNIST         |
| Fashion MNIST      | fmnist    | FMNIST        |
| Iris               | iris      | Iris          |
| Handwritten Digits | pendigits | PenDigits     |
| COIL-20            | coil20    | COIL20        |
| COIL-100           | coil100   | COIL100       |
| Mouse scRNA-seq    | scrna     | ScRNA         |
| Statlog (Shuttle)  | shuttle   | Shuttle       |
| Flow Cytometry     | flow      | not yet       |
| GoogleNews         | news      | not yet       |

Each dataset can be used in the following way:

| Sample Code       | Action |
|-------------------|--------|
| ds = data.MNIST() | Load the dataset. |
| ds.X              | Get the samples as a numpy array of shape (n_samples, n_dimensions). Samples with more than one dimension are flattened. |
| ds.y              | Get the labels of the samples. |
| len(ds)           | Get the total number of samples. |
| ds[0]             | Get the (sample, label) pair at index 0. |
| ds.shape          | Get the original shape of the samples. For example, it is (28, 28) for MNIST. |
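
To make the interface above concrete, here is a minimal sketch of a class that behaves the same way. ToyDataset is a hypothetical stand-in written for illustration only; it is not part of ncvis or of the data module, and the real dataset classes may be implemented differently. In particular, returning the flattened sample from indexing is an assumption.

```python
import numpy as np


class ToyDataset:
    """Hypothetical stand-in mimicking the dataset interface described above."""

    def __init__(self, samples, labels):
        samples = np.asarray(samples)
        # Remember the original per-sample shape, e.g. (28, 28) for image data.
        self.shape = samples.shape[1:]
        # Flatten every sample into one row: (n_samples, n_dimensions).
        self.X = samples.reshape(len(samples), -1)
        self.y = np.asarray(labels)

    def __len__(self):
        # Total number of samples.
        return len(self.X)

    def __getitem__(self, index):
        # Assumption: indexing yields the flattened sample and its label.
        return self.X[index], self.y[index]


# Ten fake 28x28 "images" with integer labels 0..9 (made-up data).
ds = ToyDataset(np.zeros((10, 28, 28)), np.arange(10))
sample, label = ds[0]
print(ds.X.shape, ds.shape, len(ds), sample.shape, label)
```

The flattened ds.X is the two-dimensional layout that the Using section passes to the visualizer as X, while ds.shape keeps enough information to restore the original sample shape for display.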
