A package for analyzing the geometry of perceptual manifolds

Usage Guide for perceptual_manifold_geometry Package

The perceptual_manifold_geometry package provides tools to analyze the geometry of data manifolds, including functions for calculating curvature, density, holes, intrinsic dimension, and nonconvexity.

Installation

First, install the package using pip:

pip install perceptual_manifold_geometry

Importing the Package

Import the package in your Python script:

import perceptual_manifold_geometry as pmg

Available Functions

The package includes the following functions:

  1. quantify_overall_concavity(data, k=20)
  2. calculate_volume(Z, d=1.0)
  3. estimate_holes_ripser(X, threshold=0.1, Persistence_diagrams=False)
  4. estimate_intrinsic_dimension(X, method='TLE')
  5. estimate_nonconvexity(X, n_projections=10, n_components=5, alpha=10000)

Function Descriptions and Examples

1. quantify_overall_concavity(data, k=20)

Calculates the overall concavity of the data manifold.

Parameters:

  • data: numpy.ndarray - Input data points.
  • k: int - Number of nearest neighbors to consider (default: 20).

Returns:

  • float - Overall concavity measure.

Example:

import numpy as np
import perceptual_manifold_geometry as pmg

# Generate random data
data = np.random.rand(100, 3)

# Calculate overall concavity
concavity = pmg.quantify_overall_concavity(data)
print(f"Overall Concavity: {concavity}")

2. calculate_volume(Z, d=1.0)

Calculates the volume and density of the data manifold.

Parameters:

  • Z: numpy.ndarray - Input data points.
  • d: float - Scaling factor (default: 1.0).

Returns:

  • tuple - Volume and density of the data manifold.

Example:

import numpy as np
import perceptual_manifold_geometry as pmg

# Generate random data
data = np.random.rand(100, 3)

# Calculate volume and density
volume, density = pmg.calculate_volume(data)
print(f"Volume: {volume}, Density: {density}")

3. estimate_holes_ripser(X, threshold=0.1, Persistence_diagrams=False)

Estimates the number of holes in the data manifold using persistent homology.

Parameters:

  • X: numpy.ndarray - Input data points.
  • threshold: float - Persistence threshold (default: 0.1).
  • Persistence_diagrams: bool - Whether to plot persistence diagrams (default: False).

Returns:

  • tuple - Number of holes, total size of persistence, mean size of persistence, density of holes.

Example:

import numpy as np
import perceptual_manifold_geometry as pmg

# Generate random data
data = np.random.rand(100, 3)

# Estimate holes and plot persistence diagrams
num_holes, total_size, mean_size, density_holes = pmg.estimate_holes_ripser(data, threshold=0.1, Persistence_diagrams=True)
print(f"Number of Holes: {num_holes}, Total Size: {total_size}, Mean Size: {mean_size}, Density Holes: {density_holes}")
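
To make the returned summary values more concrete, here is a minimal sketch of how counts and sizes could be derived from a persistence diagram. It assumes that the persistence of a feature is its death time minus its birth time (the standard definition) and that features are kept only when their persistence exceeds the threshold. The helper name `summarize_persistence` and the diagram values are hypothetical, and the package's own computation, including its density of holes, may differ.

```python
import numpy as np

def summarize_persistence(diagram, threshold=0.1):
    """Summarize a persistence diagram given as rows of (birth, death)."""
    diagram = np.asarray(diagram, dtype=float)
    # Drop features that never die (death = inf)
    finite = diagram[np.isfinite(diagram[:, 1])]
    # Persistence (lifetime) of each feature
    persistence = finite[:, 1] - finite[:, 0]
    # Assumption: only features more persistent than the threshold count as holes
    kept = persistence[persistence > threshold]
    num_holes = int(kept.size)
    total_size = float(kept.sum())
    mean_size = float(kept.mean()) if num_holes else 0.0
    return num_holes, total_size, mean_size

# Hypothetical H1 diagram: three loops, one of them short-lived noise
h1 = np.array([[0.20, 0.90], [0.10, 0.15], [0.30, 0.55]])
num_holes, total_size, mean_size = summarize_persistence(h1, threshold=0.1)
print(num_holes, total_size, mean_size)
```

Raising the threshold discards more short-lived features, so the count can only stay the same or decrease.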

4. estimate_intrinsic_dimension(X, method='TLE')

Estimates the intrinsic dimension of the data manifold.

Parameters:

  • X: numpy.ndarray - Input data points.
  • method: str - Method for estimating intrinsic dimension ('TLE' or 'Covariance').

Returns:

  • float - Estimated intrinsic dimension.

Example:

import numpy as np
import perceptual_manifold_geometry as pmg

# Generate random data
data = np.random.rand(100, 3)

# Estimate intrinsic dimension
intrinsic_dim = pmg.estimate_intrinsic_dimension(data, method='TLE')
print(f"Intrinsic Dimension: {intrinsic_dim}")

5. estimate_nonconvexity(X, n_projections=10, n_components=5, alpha=10000)

Estimates the nonconvexity of the data manifold using random projections.

Parameters:

  • X: numpy.ndarray - Input data points.
  • n_projections: int - Number of random projections to use (default: 10).
  • n_components: int - Number of dimensions to reduce to in each projection (default: 5).
  • alpha: float - Scaling factor for nonconvexity calculation (default: 10000).

Returns:

  • float - Nonconvexity measure.

Example:

import numpy as np
import perceptual_manifold_geometry as pmg

# Generate random data
data = np.random.rand(100, 3)

# Estimate nonconvexity
nonconvexity = pmg.estimate_nonconvexity(data)
print(f"Nonconvexity: {nonconvexity}")

Summary

This guide provides an overview of how to install and use the functions from the perceptual_manifold_geometry package. Each function includes a brief description, parameters, return values, and an example of how to use it. By following these examples, you can leverage the package to analyze the geometry of your data manifolds.

Geometric-metrics-for-perceptual-manifolds-in-deep-neural-networks

Perceptual Manifold in Deep Neural Network

In the neural system, when neurons receive stimuli from the same category with different physical features, a perceptual manifold is formed. The formation of perceptual manifolds helps the neural system to perceive and process objects of the same category with different features distinctly. Recent studies have shown that the response of deep neural networks to images is similar to human vision and follows the manifold distribution law. Specifically, embeddings of natural images are distributed near a low-dimensional manifold embedded in a high-dimensional space.

Consider a dataset $X=[x_1, \dots, x_m]$ and a trained deep neural network $Model = \{f(x, \theta_1), g(z, \theta_2)\}$, where $f(x, \theta_1)$ and $g(z, \theta_2)$ denote the representation network and the classifier of the model, respectively. The representation network extracts $p$-dimensional embeddings $Z=[z_1, \dots, z_m] \in \mathbb{R}^{p \times m}$ for $X$, where $z_i = f(x_i, \theta_1) \in \mathbb{R}^p$. The point cloud manifold formed by the set of embeddings $Z$ is referred to as the perceptual manifold in deep neural networks.
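
As a toy illustration of this notation, the sketch below builds an embedding matrix from a stand-in representation network. The network here (one random linear layer followed by ReLU) and all sizes are hypothetical; a real $f(x, \theta_1)$ would be the trained backbone. Note that the text writes $Z$ with one embedding per column ($p \times m$), while the code in this guide stores one sample per row ($m \times p$).

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: m samples, input dimension, embedding dimension p
m, input_dim, p = 200, 32, 8
X = rng.normal(size=(m, input_dim))  # rows are the samples x_1, ..., x_m

# Hypothetical stand-in for the parameters theta_1 of the representation network
theta_1 = rng.normal(size=(input_dim, p))

def f(x, theta):
    """Toy representation network: linear layer followed by ReLU."""
    return np.maximum(x @ theta, 0.0)

Z_rows = f(X, theta_1)  # shape (m, p): one sample per row, as in the code examples
Z = Z_rows.T            # shape (p, m): one embedding per column, as in the text
print(Z_rows.shape, Z.shape)
```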

Natural datasets exhibit inherent patterns that can be generalized under the manifold distribution principle: the distribution of a class of data is close to a low-dimensional perceptual manifold. Data classification can therefore be viewed as the unraveling and separation of perceptual manifolds. A manifold becomes harder to classify when it is entangled with other perceptual manifolds. Typically, a deep neural network consists of a feature extractor and a classifier. Feature learning can be regarded as unfolding perceptual manifolds: a well-learned feature extractor can often unfold multiple manifolds for the classifier to decode. From this perspective, every factor related to manifold complexity may affect the classification performance of the model. The sections below provide metrics for the curvature, volume, intrinsic dimension, and geometric shape of perceptual manifolds.

1. Curvature metrics for perceptual manifolds in deep neural networks (CVPR 2023)

The curvature metric of the perceptual manifold in deep neural networks makes it possible to analyze the fairness of a model from a geometric point of view. For conclusions relating the curvature of perceptual manifolds to model preferences, please refer to the paper Curvature-Balanced Feature Manifold Learning for Long-Tailed Classification.

The citation format is:

@inproceedings{ma2023curvature,
  title={Curvature-Balanced Feature Manifold Learning for Long-Tailed Classification},
  author={Ma, Yanbiao and Jiao, Licheng and Liu, Fang and Yang, Shuyuan and Liu, Xu and Li, Lingling},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  pages={15824--15835},
  year={2023}
}

The following is the code to calculate the average Gaussian curvature of the perceptual manifold.

# -*- coding: utf-8 -*-
"""
Created on Thu Jan 18 14:14:15 2024

@author: Yanbiao Ma
"""
import numpy as np
from sklearn.neighbors import NearestNeighbors
from scipy.linalg import svd


def estimate_manifold_curvature(X, k=10):
    N, D = X.shape  # N is the number of samples, D is the dimensionality
    local_curvatures = np.zeros(N)

    # Initialize the NearestNeighbors model
    nbrs = NearestNeighbors(n_neighbors=k+1).fit(X)
    
    for i in range(N):
        # Find the k+1 nearest neighbors to include the point itself
        indices = nbrs.kneighbors(X[i].reshape(1, -1), n_neighbors=k+1, return_distance=False)
        # Exclude the point itself
        indices = indices[0][1:]
        # Center the points
        M_i = X[indices] - np.mean(X[indices], axis=0)
        # Perform SVD
        U_i, Sigma_i, _ = svd(M_i, full_matrices=False)
        
        # Create diagonal matrix for Sigma_i
        Sigma_i_matrix = np.diag(Sigma_i)
        
        # Initialize sum of angles for the current point
        sum_angles = 0
        
        # Iterate over each neighbor
        for index in indices:
            # Center the neighbor's points
            neighbor_indices = nbrs.kneighbors(X[index].reshape(1, -1), n_neighbors=k+1, return_distance=False)
            neighbor_indices = neighbor_indices[0][1:]
            M_j = X[neighbor_indices] - np.mean(X[neighbor_indices], axis=0)
            # Perform SVD
            U_j, Sigma_j, _ = svd(M_j, full_matrices=False)
            
            # Create diagonal matrix for Sigma_j
            Sigma_j_matrix = np.diag(Sigma_j)
            
            # Compute Q using the provided formula
            #Q = np.dot(np.dot(U_i, Sigma_i_matrix).T, np.dot(U_j, Sigma_j_matrix))
            Q = np.dot(U_i.T, U_j)
            
            # Perform SVD of Q
            _, Sigma_Q, _ = svd(Q)
            
            # Compute the angle between local subspaces
            angle = np.arccos(np.clip(np.sum(Sigma_Q) / (np.linalg.norm(Sigma_i) * np.linalg.norm(Sigma_j)), -1.0, 1.0))
            #angle = np.arccos(np.clip(np.sum(Sigma_Q) / np.sum(np.dot(Sigma_i_matrix.T, Sigma_j_matrix)), -1.0, 1.0))
            sum_angles += angle
        
        # Calculate the average curvature for the current point
        local_curvatures[i] = sum_angles / k

    # Calculate the overall curvature of the manifold
    overall_curvature = np.mean(local_curvatures)
    
    return overall_curvature, local_curvatures


# Example usage
# Suppose data is your data matrix: each row is a sample and each column is a feature dimension.
data = np.load(r"...\MLP feature\resnet50_cifar10\Linear_output_10\0.npy")
print(data.shape)
# The function returns the overall curvature and the per-point curvatures
overall_curvature, local_curvatures = estimate_manifold_curvature(data, k=20)
print(overall_curvature)
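
The function above measures how much the local linear subspace changes from a point to its neighbors. As a generic illustration of that idea (not the package's exact computation), the sketch below computes the principal angles between two planes in 3D using only NumPy. The helper name `principal_angles` and the two example planes are assumptions made for this example.

```python
import numpy as np

def principal_angles(A, B):
    """Principal angles (radians) between the column spaces of A and B."""
    # Orthonormal bases for the two subspaces
    Qa, _ = np.linalg.qr(A)
    Qb, _ = np.linalg.qr(B)
    # Singular values of Qa^T Qb are the cosines of the principal angles
    cosines = np.linalg.svd(Qa.T @ Qb, compute_uv=False)
    return np.arccos(np.clip(cosines, -1.0, 1.0))

theta = np.pi / 6
# The xy-plane, spanned by the x and y axes
plane_xy = np.array([[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]])
# The same plane tilted by theta around the x-axis
plane_tilted = np.array([[1.0, 0.0], [0.0, np.cos(theta)], [0.0, np.sin(theta)]])

angles = principal_angles(plane_xy, plane_tilted)
print(np.degrees(angles))
```

The two planes share the x-axis, so one principal angle is zero and the other equals the tilt. On a flat manifold neighboring tangent spaces coincide and all angles vanish; larger angles indicate stronger bending.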

2. Volume metrics for perceptual manifolds in deep neural networks (ICLR 2023)

The volume of the perceptual manifold measures the richness of the distribution. See the paper, Delving into Semantic Scale Imbalance, for how to use multiscale volumetric metrics for perceptual manifolds, and for more conclusions.

The citation format is:

@inproceedings{
ma2023delving,
title={Delving into Semantic Scale Imbalance},
author={Yanbiao Ma and Licheng Jiao and Fang Liu and Yuxin Li and Shuyuan Yang and Xu Liu},
booktitle={The Eleventh International Conference on Learning Representations },
year={2023},
url={https://openreview.net/forum?id=07tc5kKRIo}
}

The following is the code to calculate the volume of the perceptual manifold.

import numpy as np

def calculate_volume(Z, d, Z_mean):
    # Z has shape (m, p): m samples (rows) and p embedding dimensions (columns)
    # Center the embeddings: (Z - Z_mean)
    diff = Z - Z_mean

    # Scatter matrix (Z - Z_mean)^T (Z - Z_mean), shape (p, p)
    outer_product = np.dot(diff.T, diff)

    # Scale by d / m
    scaled_outer_product = (d / Z.shape[0]) * outer_product

    # I + \frac{d}{m}(Z - Z_mean)^T (Z - Z_mean)
    matrix_sum = np.eye(Z.shape[1]) + scaled_outer_product

    # \frac{1}{2} \log_2 \det(I + \frac{d}{m}(Z - Z_mean)^T (Z - Z_mean))
    # slogdet returns the natural log of the determinant and avoids overflow in high dimensions
    _, logdet = np.linalg.slogdet(matrix_sum)
    volume = 0.5 * logdet / np.log(2)

    return volume

# Example usage
# Assume Z is a matrix of size 5000x10
Z = np.random.rand(5000, 10)
# Assume d is a hyperparameter
d = 1.0
# Calculate the mean Z_mean of Z
Z_mean = np.mean(Z, axis=0)

# Calculate the volume of the perceptual manifold
volume = calculate_volume(Z, d, Z_mean)
print("Perceptual manifold volume:", volume)
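
Because the determinant of $I + dC$ is the product of $1 + d\lambda_i$ over the eigenvalues $\lambda_i$ of the (biased) covariance matrix $C$, the same volume can be written as a sum over eigenvalues. The sketch below checks this identity numerically; the function name `volume_via_eigenvalues` is introduced here for illustration only and is not part of the package.

```python
import numpy as np

def volume_via_eigenvalues(Z, d=1.0):
    """0.5 * log2 det(I + d * C), with C the biased covariance of the rows of Z."""
    diff = Z - Z.mean(axis=0)
    cov = diff.T @ diff / Z.shape[0]
    eigenvalues = np.linalg.eigvalsh(cov)
    # Guard against tiny negative values caused by round-off
    eigenvalues = np.clip(eigenvalues, 0.0, None)
    return 0.5 * np.sum(np.log2(1.0 + d * eigenvalues))

rng = np.random.default_rng(0)
Z = rng.random((5000, 10))

# Direct determinant form, for comparison
diff = Z - Z.mean(axis=0)
direct = 0.5 * np.log2(np.linalg.det(np.eye(10) + diff.T @ diff / 5000))

print(volume_via_eigenvalues(Z), direct)
```

This form shows that every direction with nonzero variance contributes positively to the volume, and that the hyperparameter d controls how strongly each direction is weighted.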

3. Intrinsic Dimensions for perceptual manifolds in deep neural networks (Submitted to TPAMI)

The intrinsic dimensionality of perceptual manifolds can predict the fairness of models. Specifically, the larger the intrinsic dimensionality of the perceptual manifold corresponding to a class, the poorer the model performs on that class. Below, we provide two estimation methods for intrinsic dimensionality.

See the paper Unveiling and Mitigating Generalized Biases of DNNs through the Intrinsic Dimensions of Perceptual Manifolds.

The citation format is:

@misc{ma2024unveiling,
      title={Unveiling and Mitigating Generalized Biases of DNNs through the Intrinsic Dimensions of Perceptual Manifolds}, 
      author={Yanbiao Ma and Licheng Jiao and Fang Liu and Lingling Li and Wenping Ma and Shuyuan Yang and Xu Liu and Puhua Chen},
      year={2024},
      eprint={2404.13859},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}

(1) Estimation of intrinsic dimensions using TLE

import numpy as np
from skdim.id import TLE
# 5000 denotes the sample number and 10 denotes the dimension.
data = np.random.rand(5000, 10)
dim_estimator = TLE()
intrinsic_dim = dim_estimator.fit(data).dimension_
print("Intrinsic Dimensions:", intrinsic_dim)

(2) Covariance estimation methods for intrinsic dimensions

import numpy as np
# 5000 denotes the sample number and 10 denotes the dimension.
data = np.random.rand(5000, 10)
intrinsic_dim = (np.trace(np.dot(data.T, data)))**2/np.trace(np.dot(data.T, data)**2)
print("Intrinsic Dimensions:", intrinsic_dim)
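
A closely related and widely used covariance-based estimate is the participation ratio, $(\sum_i \lambda_i)^2 / \sum_i \lambda_i^2$, where $\lambda_i$ are the eigenvalues of the covariance matrix of the centered data. The sketch below is offered as a point of comparison under that assumption, not as the package's implementation. It differs from the expression above, in which the data are not centered and `**2` squares the matrix entries elementwise. The helper name `participation_ratio` and the synthetic data are hypothetical.

```python
import numpy as np

def participation_ratio(data):
    """(sum of eigenvalues)^2 / (sum of squared eigenvalues) of the covariance."""
    # np.cov centers the data; rowvar=False treats columns as variables
    cov = np.cov(data, rowvar=False)
    eigenvalues = np.linalg.eigvalsh(cov)
    return eigenvalues.sum() ** 2 / np.sum(eigenvalues ** 2)

rng = np.random.default_rng(0)
# Hypothetical data: 3 latent directions embedded linearly in 10 dimensions
latent = rng.normal(size=(5000, 3))
mixing, _ = np.linalg.qr(rng.normal(size=(10, 3)))  # orthonormal columns
data = latent @ mixing.T

print(participation_ratio(data))
```

For data whose variance is spread evenly over a few directions, the ratio is close to the number of those directions; it shrinks when a single direction dominates.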

4. The geometric shape of the perceptual manifold in deep neural networks (IJCV 2024)

We found that if two categories are highly similar, the geometric shapes of their corresponding embedding distributions are also highly similar. This finding shows for the first time that the geometric shape of the perceptual manifold can serve as prior knowledge to help rare categories recover their true distribution. For details, please refer to the paper Geometric Prior Guided Feature Representation Learning for Long-Tailed Classification.

The citation format is:

@article{ma2024geometric,
  title={Geometric Prior Guided Feature Representation Learning for Long-Tailed Classification},
  author={Ma, Yanbiao and Jiao, Licheng and Liu, Fang and Yang, Shuyuan and Liu, Xu and Chen, Puhua},
  journal={International Journal of Computer Vision},
  pages={1--18},
  year={2024},
  publisher={Springer}
}

The following code calculates the similarity of the geometric shapes between perceptual manifolds.

import numpy as np

# Assume the number of samples is 5000, and each sample is a (1, 10) vector.
data_matrix1 = np.random.rand(5000, 10)
data_matrix2 = np.random.rand(5000, 10)

# Calculate the covariance matrix
covariance_matrix1 = np.cov(data_matrix1, rowvar=False)

# Perform eigenvalue decomposition on the covariance matrix
eigenvalues1, eigenvectors1 = np.linalg.eigh(covariance_matrix1)

# Sort the eigenvalues
sorted_indices = np.argsort(eigenvalues1)[::-1]
eigenvalues1 = eigenvalues1[sorted_indices]
eigenvectors1 = eigenvectors1[:, sorted_indices]


# Calculate the covariance matrix
covariance_matrix2 = np.cov(data_matrix2, rowvar=False)

# Perform eigenvalue decomposition on the covariance matrix
eigenvalues2, eigenvectors2 = np.linalg.eigh(covariance_matrix2)

# Sort the eigenvalues
sorted_indices = np.argsort(eigenvalues2)[::-1]
eigenvalues2 = eigenvalues2[sorted_indices]
eigenvectors2 = eigenvectors2[:, sorted_indices]

similarity = 0
for i in range(len(eigenvectors2)):
    similarity += np.abs(np.dot(eigenvectors1[:,i].T,eigenvectors2[:,i]))

print("Similarity of the geometric shapes of the two perceptual manifolds:", similarity)
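
The same computation can be wrapped in a reusable function. The sketch below follows the steps above; the function name `shape_similarity` and the optional division by the number of dimensions (so that the score lies between 0 and 1) are assumptions added for this example and are not part of the original method.

```python
import numpy as np

def shape_similarity(data1, data2, normalize=True):
    """Sum of |cosine| between matching principal directions of two point clouds."""
    def sorted_eigenvectors(data):
        eigenvalues, eigenvectors = np.linalg.eigh(np.cov(data, rowvar=False))
        order = np.argsort(eigenvalues)[::-1]  # largest eigenvalue first
        return eigenvectors[:, order]

    V1 = sorted_eigenvectors(data1)
    V2 = sorted_eigenvectors(data2)
    # Column-wise dot products give the cosine between the i-th directions
    similarity = np.sum(np.abs(np.sum(V1 * V2, axis=0)))
    # Assumption: dividing by the dimension maps the score into [0, 1]
    return similarity / V1.shape[1] if normalize else similarity

rng = np.random.default_rng(0)
a = rng.random((5000, 10))
b = rng.random((5000, 10))

print(shape_similarity(a, a))  # a manifold compared with itself
print(shape_similarity(a, b))
```

Comparing a manifold with itself gives the maximum score, since each eigenvector has unit length. The absolute value makes the score insensitive to the arbitrary sign of each eigenvector.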
