A scalable manifold learning (SUDE) method that can cope with large-scale and high-dimensional data in an efficient manner.

Sampling-enabled scalable manifold learning unveils the discriminative cluster structure of high-dimensional data (SUDE)

We propose a scalable manifold learning (SUDE) method that copes efficiently with large-scale, high-dimensional data. It first selects a set of landmarks and embeds them to form a low-dimensional skeleton of the entire dataset, then places the non-landmark points into this skeleton using constrained locally linear embedding.
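The two-stage idea can be illustrated with a minimal NumPy sketch. This is not the SUDE implementation: landmarks are drawn by uniform random sampling and the skeleton is a plain PCA projection, both stand-ins chosen for brevity, and the names `lle_weights` and `embed_with_landmarks` are hypothetical. Only the second stage follows the description above, in the sense that each non-landmark is placed using locally linear reconstruction weights constrained to sum to one.

```python
import numpy as np


def lle_weights(x, neighbors, reg=1e-3):
    """Weights (summing to 1) that best reconstruct x from its neighbors."""
    Z = neighbors - x                       # shift so that x is the origin
    G = Z @ Z.T                             # local Gram matrix, shape (k, k)
    # Regularize so the linear system stays well conditioned.
    G = G + reg * (np.trace(G) + 1e-12) * np.eye(len(G))
    w = np.linalg.solve(G, np.ones(len(G)))
    return w / w.sum()                      # enforce the sum-to-one constraint


def embed_with_landmarks(X, n_landmarks=50, n_components=2, k=5, seed=0):
    """Embed landmarks first, then place every other point relative to them."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]

    # Stage 1 (stand-in): random landmarks, embedded with PCA.
    landmark_idx = rng.choice(n, size=n_landmarks, replace=False)
    L = X[landmark_idx]
    Lc = L - L.mean(axis=0)
    _, _, Vt = np.linalg.svd(Lc, full_matrices=False)
    Y_land = Lc @ Vt[:n_components].T

    Y = np.empty((n, n_components))
    Y[landmark_idx] = Y_land

    # Stage 2: reconstruct each non-landmark from its k nearest landmarks
    # in the input space and reuse the same weights in the embedding.
    is_landmark = np.zeros(n, dtype=bool)
    is_landmark[landmark_idx] = True
    for i in np.flatnonzero(~is_landmark):
        dist = np.linalg.norm(L - X[i], axis=1)
        nn = np.argsort(dist)[:k]
        w = lle_weights(X[i], L[nn])
        Y[i] = w @ Y_land[nn]
    return Y, landmark_idx
```

Because only the landmarks go through the global embedding step, the cost of that step depends on the number of landmarks rather than on the full dataset size; the remaining points each need only a small `k`-by-`k` linear solve.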

This repository provides the Python version of SUDE. The MATLAB version can be found at https://github.com/ZPGuiGroupWhu/sude. The related paper has been published in Nature Machine Intelligence: https://www.nature.com/articles/s42256-025-01112-9.

Project layout

The project now follows the structure of the scikit-learn-contrib/project-template:

.
|-- .github/workflows/
|-- benchmarks/
|-- doc/
|-- examples/
|-- image/
|-- sude/
|   |-- __init__.py
|   |-- _sude.py
|   |-- _version.py
|   `-- tests/
|-- pyproject.toml
`-- README.md

Installation

Python 3.8 or later is supported.

The package is published on PyPI and can be installed directly with pip:

pip install sude

Manual installation

git clone https://github.com/ZPGuiGroupWhu/SUDE-pkg.git
cd SUDE-pkg
pip install -e .
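After either installation route, a quick way to confirm that the package is visible to the current interpreter is to query its installed version. The sketch below uses only the standard library's `importlib.metadata` (available from Python 3.8); the helper name `installed_version` is hypothetical.

```python
from importlib import metadata


def installed_version(package="sude"):
    """Return the installed version string of `package`, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


version = installed_version()
print("sude", version if version else "is not installed")
```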

How to run

The package now exposes both a scikit-learn style estimator class and the legacy function wrapper.

Estimator interface

import time

import matplotlib.pyplot as plt
import numpy as np
from sude import SUDE

# Load the benchmark table: one sample per row
data = np.loadtxt("benchmarks/rice.csv", delimiter=",")

# Split into features (all but the last column) and reference labels (last column)
X = data[:, :-1]
ref = data[:, -1]

# Fit a scikit-learn style estimator and time the embedding
start_time = time.perf_counter()
model = SUDE(
    n_components=2,
    n_neighbors=10,
    init="pca",
    max_iter=50,
)
Y = model.fit_transform(X)
end_time = time.perf_counter()
print("Elapsed time:", end_time - start_time, "s")

# Plot the 2-D embedding, colored by the reference labels
plt.scatter(Y[:, 0], Y[:, 1], c=ref, cmap="tab10", s=4)
plt.show()

The estimator provides the familiar API:

model = SUDE(n_components=2, n_neighbors=10, init="spectral")
Y_train = model.fit_transform(X_train)
Y_test = model.transform(X_test)
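The snippet above assumes that `X_train` and `X_test` already exist. One possible way to produce them is a random row split, sketched below with NumPy only. The helpers `split_rows` and `embed_split` are hypothetical and not part of the package; `embed_split` imports `SUDE` lazily so that the split helper can be used on its own.

```python
import numpy as np


def split_rows(X, test_fraction=0.2, seed=0):
    """Shuffle the rows of X and return (X_train, X_test)."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(X.shape[0])
    n_test = int(round(test_fraction * X.shape[0]))
    return X[order[n_test:]], X[order[:n_test]]


def embed_split(X, **sude_params):
    """Fit SUDE on the training rows, then map the held-out rows."""
    from sude import SUDE  # imported here so split_rows works without sude

    X_train, X_test = split_rows(X)
    model = SUDE(**sude_params)
    Y_train = model.fit_transform(X_train)
    Y_test = model.transform(X_test)
    return Y_train, Y_test
```

With the package installed, a call such as `embed_split(X, n_components=2, n_neighbors=10, init="spectral")` would mirror the three lines shown above.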

Function interface

The original function entry point remains available for backwards compatibility:

from sude import sude

Y = sude(X, no_dims=2, k1=10, initialize="le", T_epoch=50)

Run the packaged example with:

uv run python examples/plot_sude_embedding.py

Run the test suite with:

uv run python -m unittest discover -s sude/tests

Citation request

Peng, D., Gui, Z., Wei, W. et al. Sampling-enabled scalable manifold learning unveils the discriminative cluster structure of high-dimensional data. Nat. Mach. Intell. (2025). https://doi.org/10.1038/s42256-025-01112-9

License

SUDE is released under the MIT License.
