
A Fast Self-Organizing Map Python Library Implemented in Numba


Welcome to NumbaSOM

A fast Self-Organizing Map Python library implemented in Numba.

This is a fast and simple-to-use SOM library. It uses online training (one data point at a time) rather than batch training. The implemented topologies are a simple 2D lattice and a torus.
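To make "online training" concrete, here is a minimal NumPy sketch of a single update step for one data vector. It is not NumbaSOM's implementation: the function name online_som_step, the Gaussian neighborhood, and the learning_rate and radius parameters are assumptions chosen for illustration.

```python
import numpy as np

def online_som_step(lattice, x, learning_rate, radius):
    """Apply one online SOM update for a single data vector x (in place)."""
    rows, cols, _ = lattice.shape
    # Best matching unit (BMU): the cell whose weight vector is closest to x.
    dists = np.linalg.norm(lattice - x, axis=2)
    bmu = np.unravel_index(np.argmin(dists), dists.shape)
    # Squared grid distance from every cell to the BMU on a flat 2D lattice.
    rr, cc = np.meshgrid(np.arange(rows), np.arange(cols), indexing="ij")
    grid_d2 = (rr - bmu[0]) ** 2 + (cc - bmu[1]) ** 2
    # Gaussian neighborhood (an assumption): cells near the BMU move more.
    h = np.exp(-grid_d2 / (2.0 * radius ** 2))
    lattice += learning_rate * h[:, :, None] * (x - lattice)
    return bmu

rng = np.random.default_rng(0)
lattice = rng.random((5, 8, 3))
x = np.array([1.0, 0.0, 0.0])
before = np.linalg.norm(lattice - x, axis=2)
bmu = online_som_step(lattice, x, learning_rate=0.5, radius=2.0)
after = np.linalg.norm(lattice - x, axis=2)
```

Batch training would instead accumulate the contributions of all data points before changing the lattice; in the online variant every single vector moves the map immediately.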

How to Install

To install this package with pip run:

pip install numbasom

To install this package with conda run:

conda install -c conda-forge numbasom

How to use

To import the library you can safely use:

from numbasom import *

A Self-Organizing Map is often used to show the underlying structure in data. To show how to use the library, we will train it on 200 random 3-dimensional vectors (so we can render them as colors):

Create 200 random colors

import numpy as np
data = np.random.random([200,3])

Initialize the library

We initialize a map with 50 rows and 100 columns. The default topology is a 2D lattice. We can also train it on a torus by setting is_torus=True.

som = SOM(som_size=(50,100), is_torus=False)
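The difference between the two topologies shows up in how far apart two cells are on the grid. The sketch below is a hypothetical helper (grid_distance is not part of NumbaSOM, and the library may compute distances differently); it only borrows the parameter names som_size and is_torus to show how wrap-around shortens distances.

```python
import numpy as np

def grid_distance(a, b, som_size, is_torus=False):
    """Euclidean distance between two cells a and b given as (row, col)."""
    d = np.abs(np.asarray(a) - np.asarray(b))
    if is_torus:
        # On a torus each axis wraps around, so take the shorter way round.
        d = np.minimum(d, np.asarray(som_size) - d)
    return float(np.hypot(d[0], d[1]))

# Opposite corners of a 50 x 100 map.
flat = grid_distance((0, 0), (49, 99), (50, 100), is_torus=False)
wrapped = grid_distance((0, 0), (49, 99), (50, 100), is_torus=True)
```

On the flat lattice the two corners are the most distant cells on the map; on the torus they are diagonal neighbors, so the map has no edges.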

Train the SOM

We will adapt the lattice over 15,000 training iterations through our data points. If we set normalize=True, the data will be normalized before training.

lattice = som.train(data, num_iterations=15000)
SOM training took: 0.366633 seconds.
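The text does not say how normalize=True rescales the data. One common choice, shown here purely as an assumption, is per-feature min-max scaling to the range [0, 1]; min_max_normalize is a hypothetical helper, not a NumbaSOM function.

```python
import numpy as np

def min_max_normalize(data):
    """Scale each column of data to the range [0, 1]."""
    lo = data.min(axis=0)
    span = data.max(axis=0) - lo
    # Guard against constant columns to avoid division by zero.
    span = np.where(span == 0, 1.0, span)
    return (data - lo) / span

raw = np.array([[0.0, 10.0, 5.0],
                [5.0, 20.0, 5.0],
                [10.0, 40.0, 5.0]])
scaled = min_max_normalize(raw)
```

The random colors used in this walkthrough already lie in [0, 1], so scaling of this kind would matter mainly for data whose features have very different ranges.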

To access an individual cell, type:

lattice[5,3]
array([0.92550425, 0.20740594, 0.92610555])

To access multiple cells, use slicing:

lattice[1::6,1]
array([[0.7473038 , 0.09876245, 0.93051731],
       [0.93542156, 0.18717452, 0.87611239],
       [0.9486485 , 0.11080808, 0.57557379],
       [0.87873391, 0.13527156, 0.368202  ],
       [0.78669284, 0.11830203, 0.2634972 ],
       [0.68213238, 0.06408478, 0.33050376],
       [0.54769163, 0.05391318, 0.31153485],
       [0.63722088, 0.12484291, 0.0684501 ],
       [0.64172725, 0.01517416, 0.09549566]])

The shape of the lattice should be (50, 100, 3)

lattice.shape
(50, 100, 3)
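Since the indexing above is plain NumPy indexing, it can be checked on a placeholder array of the same shape without training anything. The array demo below is a stand-in filled with zeros, not a trained lattice.

```python
import numpy as np

demo = np.zeros((50, 100, 3))   # same shape as the trained lattice
rows = np.arange(50)[1::6]      # rows 1, 7, 13, ..., 49
column_cells = demo[1::6, 1]    # those nine rows, column 1
block = demo[10:20, 30:40]      # a 10 x 10 patch of cells
```

The slice 1::6 selects nine rows, which matches the nine vectors printed in the slicing example.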

Visualizing the lattice

Since our lattice is made of 3-dimensional vectors, we can represent it as a lattice of colors.

import matplotlib.pyplot as plt

plt.imshow(lattice)
plt.show()


Compute U-matrix

Since most data will not be 3-dimensional, we can use the u_matrix function (the unified distance matrix, introduced by Alfred Ultsch) to visualise the map and the clusters emerging on it.

um = u_matrix(lattice)

Each cell of the U-matrix is a single value, so its shape is:

um.shape
(50, 100)
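To illustrate what a unified distance matrix measures, the following sketch averages the distance from each cell to its up, down, left and right neighbors. This is one common definition and an assumption here; the library's u_matrix may use a different neighborhood or handle a torus differently. The name u_matrix_sketch is hypothetical.

```python
import numpy as np

def u_matrix_sketch(lattice):
    """Mean distance from each cell to its up/down/left/right neighbors."""
    rows, cols, _ = lattice.shape
    total = np.zeros((rows, cols))
    count = np.zeros((rows, cols))
    # Distances between vertically adjacent cells, shape (rows - 1, cols).
    dv = np.linalg.norm(lattice[1:] - lattice[:-1], axis=2)
    total[1:] += dv
    count[1:] += 1
    total[:-1] += dv
    count[:-1] += 1
    # Distances between horizontally adjacent cells, shape (rows, cols - 1).
    dh = np.linalg.norm(lattice[:, 1:] - lattice[:, :-1], axis=2)
    total[:, 1:] += dh
    count[:, 1:] += 1
    total[:, :-1] += dh
    count[:, :-1] += 1
    return total / count

demo = np.zeros((4, 6, 3))
demo[:, 3:] = 1.0   # the right half holds a different "cluster"
um_demo = u_matrix_sketch(demo)
```

High values appear only along the boundary between the two halves, which is how cluster borders show up as ridges on a U-matrix plot.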

Plot U-matrix

The library contains a function plot_u_matrix that can help visualise it.

plot_u_matrix(um, fig_size=(6.2,6.2))


Project on the lattice

To project data on the lattice, use the project_on_lattice function.

Let's project a few predefined colors on the trained lattice and see in which cells they end up:

colors = np.array([[1.,0.,0.],[0.,1.,0.],[0.,0.,1.],[1.,1.,0.],[0.,1.,1.],[1.,0.,1.],[0.,0.,0.],[1.,1.,1.]])
color_labels = ['red', 'green', 'blue', 'yellow', 'cyan', 'purple','black', 'white']
projection = project_on_lattice(colors, lattice, additional_list=color_labels)

for p in projection:
    if projection[p]:
        print(p, projection[p][0])
Projecting on SOM took: 0.158945 seconds.
(0, 85) blue
(2, 39) white
(5, 1) purple
(10, 60) cyan
(41, 59) green
(49, 12) red
(49, 40) yellow
(49, 96) black
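Conceptually, projecting a vector means finding its best matching cell. The sketch below shows that idea in plain NumPy on a tiny hand-made 2 x 2 lattice. The function project_sketch is hypothetical and, unlike the result printed above, it returns only the cells that received at least one vector.

```python
import numpy as np
from collections import defaultdict

def project_sketch(vectors, lattice, labels):
    """Map each best-matching cell (row, col) to the labels landing there."""
    rows, cols, dim = lattice.shape
    flat = lattice.reshape(-1, dim)
    # Pairwise distances, shape (num_vectors, rows * cols).
    dists = np.linalg.norm(vectors[:, None, :] - flat[None, :, :], axis=2)
    best = np.argmin(dists, axis=1)
    hits = defaultdict(list)
    for label, idx in zip(labels, best):
        # Convert the flat index back to (row, col).
        r, c = divmod(int(idx), cols)
        hits[(r, c)].append(label)
    return dict(hits)

tiny = np.array([[[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]],
                 [[0.0, 0.0, 1.0], [1.0, 1.0, 1.0]]])
samples = np.array([[0.9, 0.1, 0.0], [0.1, 0.0, 0.8], [0.9, 0.9, 1.0]])
result = project_sketch(samples, tiny, ["red", "blue", "white"])
```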

Find every cell's closest vector in the data

To find every cell's closest vector in the provided data, use the lattice_closest_vectors function.

We can again use the colors example:

closest = lattice_closest_vectors(colors, lattice, additional_list=color_labels)
Finding closest data points took: 0.003056 seconds.

We can now ask which value in color_labels each lattice cell is closest to:

closest[(1,1)]
['purple']
closest[(40,80)]
['green']

We can find the closest vectors without supplying an additional list. Then we get the association between the lattice and the data vectors that we can display as colors.

closest_vec = lattice_closest_vectors(colors, lattice)
Finding closest data points took: 0.003491 seconds.

We take the values of closest_vec and reshape them into a NumPy array called values.

values = np.array(list(closest_vec.values())).reshape(50,100,-1)

We can now visualise the projection of our 8 hard-coded colors onto the lattice:

plt.imshow(values)
plt.show()
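This is the reverse of projection: instead of asking where each data vector lands, we ask which data vector each cell resembles most. A minimal NumPy sketch of that lookup follows; closest_index_per_cell is a hypothetical helper that returns an index array rather than the dictionary the library returns.

```python
import numpy as np

def closest_index_per_cell(vectors, lattice):
    """For every lattice cell, the index of the nearest row in vectors."""
    # Broadcast to distances of shape (rows, cols, num_vectors).
    diffs = lattice[:, :, None, :] - vectors[None, None, :, :]
    dists = np.linalg.norm(diffs, axis=3)
    return np.argmin(dists, axis=2)

palette = np.array([[1.0, 0.0, 0.0], [0.0, 0.0, 1.0]])   # red, blue
names = np.array(["red", "blue"])
demo = np.array([[[0.9, 0.1, 0.1], [0.2, 0.1, 0.7]],
                 [[0.6, 0.0, 0.3], [0.0, 0.0, 0.9]]])
idx = closest_index_per_cell(palette, demo)
cell_names = names[idx]      # nearest label per cell
cell_colors = palette[idx]   # nearest color per cell, shape (2, 2, 3)
```

Indexing the palette with idx yields an image-shaped array directly, which plays the same role as the reshaped values array in the walkthrough.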


Compute how each data vector 'activates' the lattice

We can use the function lattice_activations:

activations = lattice_activations(colors, lattice)
Computing SOM activations took: 0.000484 seconds.

Now we can show how the blue vector, [0.,0.,1.], activates the lattice:

plt.imshow(activations[2])
plt.show()


To emphasize the higher values and suppress the lower ones, we can pass the exponent argument when computing the activations:

activations = lattice_activations(colors, lattice, exponent=8)
Computing SOM activations took: 0.000838 seconds.
plt.imshow(activations[2])
plt.show()
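The formula behind lattice_activations is not given here, so the following is only a sketch under a stated assumption: each activation is a similarity in [0, 1] (1 at the closest cell, 0 at the farthest) raised to the power exponent. The name activations_sketch is hypothetical.

```python
import numpy as np

def activations_sketch(vectors, lattice, exponent=1):
    """One (rows, cols) map per vector: 1 at the closest cell, 0 at the farthest."""
    diffs = lattice[None, :, :, :] - vectors[:, None, None, :]
    dists = np.linalg.norm(diffs, axis=3)          # (n, rows, cols)
    lo = dists.min(axis=(1, 2), keepdims=True)
    hi = dists.max(axis=(1, 2), keepdims=True)
    # Avoid division by zero if all cells are equally distant.
    span = np.where(hi > lo, hi - lo, 1.0)
    similarity = 1.0 - (dists - lo) / span         # assumed scaling to [0, 1]
    return similarity ** exponent

rng = np.random.default_rng(1)
demo = rng.random((4, 5, 3))
blue = np.array([[0.0, 0.0, 1.0]])
a1 = activations_sketch(blue, demo, exponent=1)
a8 = activations_sketch(blue, demo, exponent=8)
```

Under this assumption a larger exponent keeps the peak at 1 while pushing every other value towards 0, so the most strongly activated region stands out more sharply in the plot.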

