
SUDE: a scalable manifold learning method that handles large-scale, high-dimensional data efficiently.


Sampling-enabled scalable manifold learning unveils the discriminative cluster structure of high-dimensional data (SUDE)

We propose SUDE, a scalable manifold learning method that handles large-scale, high-dimensional data efficiently. It first selects a set of landmarks and uses them to construct a low-dimensional skeleton of the entire dataset, then incorporates the non-landmark points into this skeleton using constrained locally linear embedding.
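The landmark-then-incorporate idea described above can be illustrated with a small NumPy sketch. This is not the SUDE implementation: uniform random sampling stands in for the landmark selection, PCA stands in for the skeleton embedding, and the sum-to-one weights are those of standard locally linear embedding. The names `pca_embed`, `lle_weights`, and `landmark_embed` are hypothetical and exist only for this illustration.

```python
import numpy as np

def pca_embed(X, n_components=2):
    """Stand-in skeleton embedding: project centered data onto top principal axes."""
    Xc = X - X.mean(axis=0)
    _, _, vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ vt[:n_components].T

def lle_weights(x, neighbors, reg=1e-3):
    """Weights that sum to 1 and best reconstruct x from its neighbors (standard LLE)."""
    Z = neighbors - x                                  # shift so x sits at the origin
    G = Z @ Z.T                                        # local Gram matrix
    G += reg * (np.trace(G) + 1e-12) * np.eye(len(G))  # regularize for stability
    w = np.linalg.solve(G, np.ones(len(G)))
    return w / w.sum()                                 # enforce the sum-to-one constraint

def landmark_embed(X, n_landmarks=50, k=5, n_components=2, seed=0):
    """Embed landmarks first, then place every other point relative to them."""
    rng = np.random.default_rng(seed)
    idx = rng.choice(len(X), size=n_landmarks, replace=False)  # assumption: random landmarks
    is_landmark = np.zeros(len(X), dtype=bool)
    is_landmark[idx] = True

    L = X[idx]
    Y_land = pca_embed(L, n_components)                # low-dimensional skeleton
    Y = np.empty((len(X), n_components))
    Y[idx] = Y_land

    for i in np.flatnonzero(~is_landmark):
        dist = np.linalg.norm(L - X[i], axis=1)
        nn = np.argsort(dist)[:k]                      # k nearest landmarks in input space
        w = lle_weights(X[i], L[nn])
        Y[i] = w @ Y_land[nn]                          # same weights, applied in the embedding
    return Y, idx

X_demo = np.random.default_rng(0).normal(size=(300, 10))
Y_demo, landmark_idx = landmark_embed(X_demo)
print(Y_demo.shape)
```

Only the landmarks go through the embedding step here; each remaining point costs one small linear solve, which is the kind of saving a landmark-based design aims for.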

This repository provides the Python version of SUDE. The MATLAB version can be found at https://github.com/ZPGuiGroupWhu/sude. The related paper has been published in Nature Machine Intelligence: https://www.nature.com/articles/s42256-025-01112-9.


Project layout

The project now follows the structure of the scikit-learn-contrib/project-template:

.
|-- .github/workflows/
|-- benchmarks/
|-- doc/
|-- examples/
|-- image/
|-- sude/
|   |-- __init__.py
|   |-- _sude.py
|   |-- _version.py
|   `-- tests/
|-- pyproject.toml
`-- README.md

Installation

Python 3.8 and above are supported.

SUDE is published on PyPI and can be installed directly with pip:

pip install sude

Manual installation

git clone https://github.com/ZPGuiGroupWhu/SUDE-pkg.git
cd SUDE-pkg
pip install -e .
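After either installation route, a quick check from Python can confirm that the interpreter meets the 3.8 requirement and that the package is importable. The helper name `check_sude_install` is hypothetical; the sketch uses only the standard library and does not import `sude` itself.

```python
import sys
from importlib import metadata, util

def check_sude_install(min_python=(3, 8)):
    """Report whether Python is new enough and whether `sude` can be found."""
    python_ok = sys.version_info >= min_python
    installed = util.find_spec("sude") is not None
    version = None
    if installed:
        try:
            version = metadata.version("sude")
        except metadata.PackageNotFoundError:
            # Importable (e.g. from a source checkout) but without installed metadata.
            version = None
    return {"python_ok": python_ok, "installed": installed, "version": version}

print(check_sude_install())
```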

How to run

The package exposes both a scikit-learn-style estimator class and the legacy function wrapper.

Estimator interface

import numpy as np
from sude import SUDE
import time
import matplotlib.pyplot as plt

# Input data
data = np.loadtxt("benchmarks/rice.csv", delimiter=",")

# Split the feature columns from the last column, which holds the true labels
X = data[:, :-1]
ref = data[:, -1]

# Fit a scikit-learn style estimator
start_time = time.time()
model = SUDE(
    n_components=2,
    n_neighbors=10,
    init="pca",
    max_iter=50,
)
Y = model.fit_transform(X)
end_time = time.time()
print(f"Elapsed time: {end_time - start_time:.2f} s")

plt.scatter(Y[:, 0], Y[:, 1], c=ref, cmap='tab10', s=4)
plt.show()
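The example above reads a CSV whose last column holds the labels and whose other columns hold the features. For trying the interface without the benchmark file, a toy file in the same layout can be generated as sketched below. The helpers `make_labeled_csv` and `load_labeled_csv` and the Gaussian-blob data are assumptions made for illustration, not part of the package.

```python
import os
import tempfile
import numpy as np

def make_labeled_csv(path, n_per_class=100, n_features=6, n_classes=3, seed=0):
    """Write Gaussian blobs to a CSV: feature columns first, label in the last column."""
    rng = np.random.default_rng(seed)
    blocks = []
    for label in range(n_classes):
        center = rng.normal(scale=5.0, size=n_features)
        points = center + rng.normal(size=(n_per_class, n_features))
        labels = np.full((n_per_class, 1), label, dtype=float)
        blocks.append(np.hstack([points, labels]))
    np.savetxt(path, np.vstack(blocks), delimiter=",")

def load_labeled_csv(path):
    """Load a CSV in the layout above and return (features, labels)."""
    data = np.loadtxt(path, delimiter=",")
    return data[:, :-1], data[:, -1]

with tempfile.TemporaryDirectory() as tmp:
    csv_path = os.path.join(tmp, "toy.csv")
    make_labeled_csv(csv_path)
    X, ref = load_labeled_csv(csv_path)

print(X.shape, ref.shape)  # (300, 6) (300,)
```

The resulting `X` and `ref` can be passed to the estimator and to the scatter plot in place of the benchmark data.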

The estimator follows the familiar scikit-learn fit/transform API, so an embedding fitted on training data can be applied to new samples:

model = SUDE(n_components=2, n_neighbors=10, init="spectral")
Y_train = model.fit_transform(X_train)
Y_test = model.transform(X_test)
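The snippet above assumes that `X_train` and `X_test` already exist. A minimal sketch of the full pattern follows, with a hypothetical `split_rows` helper and a small PCA class, `PCAStandIn`, standing in for the estimator so that the code runs without SUDE installed. Swapping in `SUDE(...)` as shown in the comment is the intended use.

```python
import numpy as np

class PCAStandIn:
    """Stand-in with the same fit / transform / fit_transform surface (not SUDE)."""
    def __init__(self, n_components=2):
        self.n_components = n_components

    def fit(self, X):
        self.mean_ = X.mean(axis=0)
        _, _, vt = np.linalg.svd(X - self.mean_, full_matrices=False)
        self.components_ = vt[:self.n_components]
        return self

    def transform(self, X):
        return (X - self.mean_) @ self.components_.T

    def fit_transform(self, X):
        return self.fit(X).transform(X)

def split_rows(X, test_fraction=0.25, seed=0):
    """Shuffle the rows and return (train, test) subsets."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(X))
    n_test = int(round(test_fraction * len(X)))
    return X[order[n_test:]], X[order[:n_test]]

X = np.random.default_rng(0).normal(size=(200, 8))
X_train, X_test = split_rows(X)

# With the package installed: model = SUDE(n_components=2, n_neighbors=10)
model = PCAStandIn(n_components=2)
Y_train = model.fit_transform(X_train)  # learn the embedding on training rows only
Y_test = model.transform(X_test)        # map unseen rows with the fitted model
print(Y_train.shape, Y_test.shape)      # (150, 2) (50, 2)
```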

Function interface

The original function entry point remains available for backwards compatibility:

from sude import sude

Y = sude(X, no_dims=2, k1=10, initialize="le", T_epoch=50)
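Comparing this call with the estimator example suggests a correspondence between the legacy keyword names and the estimator parameters. The mapping below is inferred only from the matching values in the two examples (2, 10, 50) and is an assumption, not documented behaviour. It renames keys and leaves values untouched, so option strings such as `"le"` may still need translating by hand. `LEGACY_TO_ESTIMATOR` and `legacy_to_estimator_kwargs` are hypothetical names.

```python
# Assumed name correspondence, inferred from the two usage examples.
LEGACY_TO_ESTIMATOR = {
    "no_dims": "n_components",
    "k1": "n_neighbors",
    "initialize": "init",
    "T_epoch": "max_iter",
}

def legacy_to_estimator_kwargs(**legacy_kwargs):
    """Rename legacy keyword arguments to their assumed estimator equivalents."""
    unknown = set(legacy_kwargs) - set(LEGACY_TO_ESTIMATOR)
    if unknown:
        raise KeyError(f"no known estimator equivalent for: {sorted(unknown)}")
    return {LEGACY_TO_ESTIMATOR[name]: value for name, value in legacy_kwargs.items()}

print(legacy_to_estimator_kwargs(no_dims=2, k1=10, initialize="le", T_epoch=50))
# {'n_components': 2, 'n_neighbors': 10, 'init': 'le', 'max_iter': 50}
```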

Run the packaged example with:

uv run python examples/plot_sude_embedding.py

Run the test suite with:

uv run python -m unittest discover -s sude/tests

Citation request

Peng, D., Gui, Z., Wei, W. et al. Sampling-enabled scalable manifold learning unveils the discriminative cluster structure of high-dimensional data. Nat. Mach. Intell. (2025). https://doi.org/10.1038/s42256-025-01112-9

License

SUDE is released under the MIT License.
