Efficient implementation of Self-Organizing Maps using PyTorch
Self-Organizing Map
PyTorch implementation of a Self-Organizing Map (SOM). The implementation can run on a GPU, when one is available, for faster computation. It follows scikit-learn conventions for training and using the model.
Requirements
The SOM object requires numpy, scipy and torch to be installed.
The graph-based clustering requires scikit-learn and the image-based clustering requires scikit-image. The graph-based clustering is used by default.
The toy example uses scikit-learn to generate the toy dataset.
The MD application requires pymol to load the trajectory.
The package can then be installed with:
pip install quicksom
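The optional dependencies listed above can be checked before running a given feature. The snippet below is a minimal sketch using only the standard library; the helper name `missing_packages` and the two dictionaries are hypothetical and not part of quicksom. It relies on the usual import names of the packages (scikit-learn is imported as `sklearn`, scikit-image as `skimage`).

```python
from importlib.util import find_spec

# pip-style name -> import name (hypothetical tables, not part of quicksom)
REQUIRED = {"numpy": "numpy", "scipy": "scipy", "torch": "torch"}
OPTIONAL = {
    "scikit-learn": "sklearn",   # graph-based clustering, toy example
    "scikit-image": "skimage",   # image-based clustering
    "pymol": "pymol",            # MD application
}


def missing_packages(packages):
    """Return the pip-style names whose import module cannot be found."""
    return [name for name, module in packages.items()
            if find_spec(module) is None]
```

Calling `missing_packages(REQUIRED)` and `missing_packages(OPTIONAL)` returns the names that still need to be installed; an empty list means everything in that group was found.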
SOM object interface
The SOM object can be created with any grid size and with an optional periodic topology. Optimization parameters, such as the number of training epochs or the batch size, can also be chosen.
import pickle
import numpy
import torch
from som import SOM
# Load the data and move it to the GPU if one is available
device = 'cuda' if torch.cuda.is_available() else 'cpu'
X = numpy.load('contact_desc.npy')
X = torch.from_numpy(X)
X = X.float()
X = X.to(device)
# Create the SOM object and train it
m, n = 100, 100
dim = X.shape[1]
niter = 5
batch_size = 100
som = SOM(m, n, dim, niter=niter, device=device)
learning_error = som.fit(X, batch_size=batch_size)
# Run inference on the training data
bmus, inference_error = som.predict(X, batch_size=batch_size)
predicted_clusts, errors = som.predict_cluster(X)
# Move the model back to the CPU and store it as a pickle
som.to_device('cpu')
with open('som.pickle', 'wb') as f:
    pickle.dump(som, f)
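To illustrate what a periodic topology means for the grid, the sketch below computes the distance between two cells of an m x n map, with and without wrap-around at the edges. It is an illustrative assumption about how such a distance can be defined, not quicksom's internal implementation; the function name `grid_distance` is hypothetical.

```python
def grid_distance(a, b, m, n, periodic=False):
    """Euclidean distance between cells a=(i, j) and b=(k, l) of an m x n grid.

    With periodic=True the grid wraps around in both directions (a torus),
    so the shortest path may cross an edge of the map.
    """
    di = abs(a[0] - b[0])
    dj = abs(a[1] - b[1])
    if periodic:
        # Going "the other way around" costs m - di (resp. n - dj) steps.
        di = min(di, m - di)
        dj = min(dj, n - dj)
    return (di ** 2 + dj ** 2) ** 0.5
```

On a 10 x 10 map, cells (0, 0) and (9, 0) are 9 steps apart on a flat grid but direct neighbors on a periodic one, which removes the border effects of a bounded map.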
Sample inference and analysis script:
import pickle
import numpy
import torch
device = 'cuda' if torch.cuda.is_available() else 'cpu'
# Load the trained SOM and move it to the chosen device
with open('som.pickle', 'rb') as f:
    som = pickle.load(f)
som.to_device(device)
# Cluster the SOM
som.cluster(min_distance=1)
# Load the data
X = numpy.load('contact_desc.npy')
X = torch.from_numpy(X)
X = X.float()
X = X.to(device)
# Store the centroids reshaped to the (m, n, dim) grid layout
smap = som.centroids.reshape((som.m, som.n, -1))
som.smap = smap
# Run inference and attach the results to the SOM object
bmus, inference_error = som.predict(X)
som.bmus = bmus
som.inference_error = inference_error
predicted_clusts, errors = som.predict_cluster(X)
som.predicted_clusts = predicted_clusts
som.errors = errors
# Move the model back to the CPU and overwrite the pickle
som.to_device('cpu')
with open('som.pickle', 'wb') as f:
    pickle.dump(som, f)
$ ./main.py
training ... cpu
_parameters ... cpu
_buffers ... cpu
_non_persistent_buffers_set ... cpu
_backward_hooks ... cpu
_forward_hooks ... cpu
_forward_pre_hooks ... cpu
_state_dict_hooks ... cpu
_load_state_dict_pre_hooks ... cpu
_modules ... cpu
m ... cpu
n ... cpu
grid_size ... cpu
dim ... cpu
periodic ... cpu
p_norm ... cpu
sched ... cpu
niter ... cpu
alpha ... cpu
sigma ... cpu
centroids -> cpu
locations -> cpu
maprange -> cpu
offset1 -> cpu
offset2 -> cpu
offset3 -> cpu
offset4 -> cpu
offset5 -> cpu
offset6 -> cpu
offset7 -> cpu
offset8 -> cpu
precompute ... cpu
distance_mat -> cpu
umat ... cpu
cluster_att ... cpu
alpha_op ... cpu
sigma_op ... cpu
Input dataset:
U-matrix:
Data projection:
Cluster projection:
Cluster assignment:
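For readers unfamiliar with the U-matrix listed among the figures above: it is commonly defined as, for each cell of the map, the average distance between that cell's centroid and the centroids of its grid neighbors. The sketch below is a plain-Python illustration under simplifying assumptions (4-connected neighbors, non-periodic grid, Euclidean distance); it is not quicksom's own computation, and the function name `umatrix` is hypothetical.

```python
import math


def umatrix(smap):
    """Mean distance from each cell's centroid to its 4-connected neighbors.

    smap[i][j] is a sequence of floats: the centroid of cell (i, j).
    Returns an m x n nested list of floats.
    """
    m, n = len(smap), len(smap[0])
    out = [[0.0] * n for _ in range(m)]
    for i in range(m):
        for j in range(n):
            dists = []
            for di, dj in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                k, l = i + di, j + dj
                if 0 <= k < m and 0 <= l < n:  # skip neighbors outside the grid
                    dists.append(math.dist(smap[i][j], smap[k][l]))
            # A 1 x 1 map has no neighbors; leave its value at 0.0.
            out[i][j] = sum(dists) / len(dists) if dists else 0.0
    return out
```

High values mark boundaries between regions of the map whose centroids differ strongly, which is why the U-matrix is a natural input for clustering the map.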