
A Fast Self-Organizing Map Python Library Implemented in Numba


Welcome to NumbaSOM

A fast Self-Organizing Map Python library implemented in Numba.

This is a fast and simple-to-use SOM library. It uses online training (one data point at a time) rather than batch training. The implemented topologies are a simple 2D lattice and a torus.

How to Install

The package is available on PyPI. To install it, run:

pip install numbasom

How to use

A Self-Organizing Map is often used to show the underlying structure in data. To show how to use the library, we will train it on 200 random 3-dimensional vectors (so we can render them as colors):

import numpy as np
from numbasom import SOM

Create 200 random colors

data = np.random.random([200,3])

Initialize the library

We initialize a large map with 50 rows and 100 columns. The default topology is a 2D lattice. We can also train it on a torus by setting is_torus=True.

som = SOM(som_size=(50,100), is_torus=False)
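To illustrate what the torus option changes, here is a minimal sketch of how the distance between two lattice cells differs between the two topologies. The helper grid_distance_sq is hypothetical and is not part of numbasom; how the library computes this internally is not shown in this README.

```python
import numpy as np

def grid_distance_sq(cell_a, cell_b, som_size, is_torus=False):
    """Squared distance between two lattice cells given as (row, col).

    Hypothetical helper for illustration only, not a numbasom function.
    """
    diff = np.abs(np.asarray(cell_a) - np.asarray(cell_b))
    if is_torus:
        # On a torus each axis wraps around, so take the shorter way round.
        diff = np.minimum(diff, np.asarray(som_size) - diff)
    return int(np.sum(diff ** 2))

# Opposite corners of a 50 x 100 map
print(grid_distance_sq((0, 0), (49, 99), (50, 100), is_torus=False))  # 12202
print(grid_distance_sq((0, 0), (49, 99), (50, 100), is_torus=True))   # 2
```

On the flat lattice the two corners are as far apart as possible, while on the torus they are diagonal neighbours, so a torus map has no edge cells.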

Train the SOM

We will adapt the lattice by iterating 10000 times through our data points. If we set is_scaled=False, data will be normalized before training.

lattice = som.train(data, num_iterations=10000, is_scaled=True)
SOM training took: 0.343299 seconds.
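For readers unfamiliar with online training, the following plain NumPy sketch shows the general idea: pick one data point, find its best matching unit, and pull nearby cells towards that point. It is not numbasom's implementation. The function name train_online_sketch, the linear decay schedules, the Gaussian neighbourhood and the small map size are all assumptions made for illustration.

```python
import numpy as np

def train_online_sketch(data, som_size=(5, 8), num_iterations=500, seed=0):
    """Toy online SOM training on a flat (non-torus) lattice.

    Illustration only: schedules and neighbourhood are assumed,
    not taken from numbasom.
    """
    rng = np.random.default_rng(seed)
    rows, cols = som_size
    lattice = rng.random((rows, cols, data.shape[1]))
    # Grid coordinates of every cell, shape (rows, cols, 2)
    grid = np.stack(
        np.meshgrid(np.arange(rows), np.arange(cols), indexing="ij"), axis=-1
    )
    sigma0 = max(rows, cols) / 2.0
    for t in range(num_iterations):
        frac = t / num_iterations
        lr = 0.5 * (1.0 - frac)                  # assumed linear decay
        sigma = max(sigma0 * (1.0 - frac), 0.5)  # assumed shrinking radius
        x = data[rng.integers(len(data))]        # one data point at a time
        # Best matching unit: the cell whose vector is closest to x
        dists = np.sum((lattice - x) ** 2, axis=-1)
        bmu = np.unravel_index(np.argmin(dists), dists.shape)
        # Gaussian neighbourhood around the BMU, measured on the grid
        grid_d2 = np.sum((grid - np.array(bmu)) ** 2, axis=-1)
        h = np.exp(-grid_d2 / (2.0 * sigma ** 2))
        # Move every cell towards x, weighted by its neighbourhood value
        lattice += lr * h[..., None] * (x - lattice)
    return lattice

toy_data = np.random.default_rng(1).random((200, 3))
toy_lattice = train_online_sketch(toy_data)
print(toy_lattice.shape)  # (5, 8, 3)
```

A batch algorithm would instead accumulate the updates over all data points before changing the lattice; the online variant changes it after every single point.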

We can display a few lattice cells to confirm that they are 3-dimensional vectors:

lattice[1::6,1]
array([[0.84320274, 0.30492829, 0.41450252],
       [0.74273113, 0.24442997, 0.46775752],
       [0.7079431 , 0.18936966, 0.3849527 ],
       [0.67390875, 0.13944585, 0.1926078 ],
       [0.511751  , 0.04086609, 0.17727931],
       [0.40736032, 0.21150301, 0.0357543 ],
       [0.35872757, 0.28562641, 0.0832515 ],
       [0.16635622, 0.08152816, 0.38226892],
       [0.05055562, 0.05010728, 0.34043282]])

The shape of the lattice is (50, 100, 3), as expected:

lattice.shape
(50, 100, 3)

Visualizing the lattice

Since our lattice is made of 3-dimensional vectors, we can represent it as a lattice of colors.

import matplotlib.pyplot as plt

plt.imshow(lattice)
plt.show()


Compute U-matrix

Since most data will not be 3-dimensional, we can use the U-matrix (unified distance matrix, introduced by Alfred Ultsch) to visualise the map and the clusters emerging on it.

from numbasom import u_matrix

um = u_matrix(lattice)
um.shape
(50, 100)
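As a rough picture of what a U-matrix holds, the sketch below computes, for each cell, the mean Euclidean distance to its up, down, left and right neighbours on a flat lattice. The function u_matrix_sketch is hypothetical; numbasom's own u_matrix may use a different neighbourhood or treat the torus case differently.

```python
import numpy as np

def u_matrix_sketch(lattice):
    """Mean distance from each cell to its 4-connected neighbours.

    Illustration only, assuming a flat lattice with at least 2 rows
    and 2 columns; not numbasom's u_matrix.
    """
    rows, cols, _ = lattice.shape
    total = np.zeros((rows, cols))
    count = np.zeros((rows, cols))
    # Distances between vertically adjacent cells, shape (rows - 1, cols)
    d_vert = np.linalg.norm(lattice[1:] - lattice[:-1], axis=-1)
    total[1:] += d_vert
    count[1:] += 1
    total[:-1] += d_vert
    count[:-1] += 1
    # Distances between horizontally adjacent cells, shape (rows, cols - 1)
    d_horz = np.linalg.norm(lattice[:, 1:] - lattice[:, :-1], axis=-1)
    total[:, 1:] += d_horz
    count[:, 1:] += 1
    total[:, :-1] += d_horz
    count[:, :-1] += 1
    return total / count

# Two flat regions: black on the left, white on the right
demo = np.zeros((4, 6, 3))
demo[:, 3:] = 1.0
um_demo = u_matrix_sketch(demo)
print(um_demo.shape)  # (4, 6)
```

In this toy input only the cells along the black/white border get non-zero values, which is how cluster boundaries show up as ridges in a U-matrix plot.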

Plot U-matrix

The library contains a function plot_u_matrix that can help visualise it.

from numbasom import plot_u_matrix

plot_u_matrix(um, fig_size=(6.2,6.2))



