A package for performing linear algebra operations on the MNIST dataset.
Linear Algebra
Introduction
This project applies Singular Value Decomposition (SVD), Principal Component Analysis (PCA), and Pixel Intensity Sum (PIS) prediction to the MNIST dataset. SVD and PCA are used for dimensionality reduction: they compress each image into a small number of components while retaining the features needed for tasks such as image reconstruction and classification. The PIS experiment fits a linear regression on the PCA-reduced features to predict the sum of each image's pixel intensities, showing how much of that image-level characteristic survives the reduction. Together, the three experiments illustrate how SVD and PCA handle high-dimensional data and how a simple linear model can make useful predictions from the reduced representation.
Below is the directory structure:
Singular Value Decomposition (SVD)
The code from svd.py performs the following operations:
- Loads the MNIST Dataset: Utilizes PyTorch's datasets and transforms to load the MNIST dataset, normalizing the images.
- Applies Singular Value Decomposition (SVD): Converts images to NumPy arrays, centers them by subtracting the mean, and then applies SVD to decompose the images, retaining only the top k components for dimensionality reduction.
- Reconstructs Images: Uses the top k components from SVD to reconstruct images.
- Visual Comparison: Plots and compares the first original image from the MNIST dataset against its reconstructed version after applying SVD, facilitating a visual assessment of the reconstruction quality.
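The centering, truncation, and reconstruction steps above can be sketched with NumPy alone. This is a minimal illustration, not the contents of svd.py: it assumes the SVD is taken over the matrix of flattened images (one image per row), uses random arrays as a stand-in for MNIST so that it runs without a download, and the function name `svd_reconstruct` is hypothetical.

```python
import numpy as np


def svd_reconstruct(images, k):
    """Reconstruct a stack of images from the top-k singular components.

    Assumption: SVD is applied to the (n_images, n_pixels) data matrix.
    """
    n, h, w = images.shape
    X = images.reshape(n, h * w).astype(np.float64)  # one flattened image per row
    mean = X.mean(axis=0)                            # per-pixel mean image
    Xc = X - mean                                    # center the data
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    Xk = (U[:, :k] * S[:k]) @ Vt[:k, :]              # rank-k approximation
    return (Xk + mean).reshape(n, h, w)              # add the mean back


# Synthetic stand-in for MNIST (assumption): 100 random 28x28 "images".
rng = np.random.default_rng(0)
images = rng.random((100, 28, 28))

for k in (5, 20, 100):
    recon = svd_reconstruct(images, k)
    rel_err = np.linalg.norm(images - recon) / np.linalg.norm(images)
    print(f"k={k:3d}  relative reconstruction error={rel_err:.4f}")
```

Increasing k lowers the reconstruction error, and once k reaches the rank of the centered data matrix the reconstruction matches the input up to floating-point error.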
Principal Component Analysis (PCA)
The code from eigen_VV.py performs the following operations:
- Loads the MNIST dataset, normalizing and flattening the images for processing.
- Computes PCA by centering the data, calculating the covariance matrix, and determining its eigenvalues and eigenvectors, which are then sorted in descending order of eigenvalue.
- Plots the first few "eigenfaces" by reshaping the leading eigenvectors and displaying them as images, illustrating the principal components that capture the most variance in the dataset.
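The covariance-based PCA described above can be sketched as follows. This is an illustration under stated assumptions rather than the code in eigen_VV.py: the data are random 8x8 arrays standing in for flattened MNIST images, and the function name `pca_eig` is hypothetical.

```python
import numpy as np


def pca_eig(X):
    """Return eigenvalues (descending) and matching eigenvectors (as columns)
    of the covariance matrix of X, where each row of X is one flattened image."""
    Xc = X - X.mean(axis=0)                 # center each pixel column
    cov = np.cov(Xc, rowvar=False)          # (n_pixels, n_pixels) covariance
    eigvals, eigvecs = np.linalg.eigh(cov)  # eigh: for symmetric matrices, ascending order
    order = np.argsort(eigvals)[::-1]       # indices for descending eigenvalues
    return eigvals[order], eigvecs[:, order]


# Synthetic stand-in for MNIST (assumption): 200 flattened 8x8 "images".
rng = np.random.default_rng(0)
X = rng.random((200, 64))

eigvals, eigvecs = pca_eig(X)
print("First 10 eigenvalues:", eigvals[:10])

# Each eigenvector can be reshaped back to image dimensions for display.
first_component = eigvecs[:, 0].reshape(8, 8)
```

Passing `first_component` to an image-plotting call such as matplotlib's `imshow` would display it in the same way as the eigenvector plots described above.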
Pixel Intensity Sum (PIS)
The code from linear_regression.py performs the following operations:
- Loads and preprocesses the MNIST dataset, normalizing images and flattening them into vectors.
- Calculates pixel intensity sums for each image as a target variable for regression.
- Applies PCA to reduce the dimensionality of the dataset, retaining 50% of the variance.
- Splits the dataset into training and testing sets for model validation.
- Trains a linear regression model to predict pixel intensity sums based on PCA-reduced features.
- Visualizes predictions using a bar chart to compare actual vs. predicted pixel intensity sums for a subset of images.
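The pipeline above (target computation, PCA with a variance threshold, train/test split, linear regression) can be sketched with NumPy only. This is not the code in linear_regression.py, which may well use a dedicated machine learning library. The assumptions here are: synthetic images built as a brightness-scaled pattern plus noise in place of MNIST, PCA computed via SVD, an 80/20 split, and ordinary least squares via `np.linalg.lstsq`.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for flattened images (assumption): a fixed pattern
# scaled by a per-image brightness, plus a little pixel noise.
brightness = rng.random((500, 1))
pattern = rng.random((1, 64))
X = brightness * pattern + 0.05 * rng.random((500, 64))

y = X.sum(axis=1)                      # target: pixel intensity sum per image

# PCA via SVD of the centered data.
Xc = X - X.mean(axis=0)
U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
explained = S**2 / np.sum(S**2)        # variance ratio per component
cumulative = np.cumsum(explained)
k = int(np.searchsorted(cumulative, 0.5) + 1)  # smallest k reaching 50% variance
Z = Xc @ Vt[:k].T                      # PCA-reduced features

# Train/test split (80/20, an assumption).
idx = rng.permutation(len(X))
train, test = idx[:400], idx[400:]

# Linear regression with an intercept, solved by least squares.
A_train = np.column_stack([Z[train], np.ones(len(train))])
coef, *_ = np.linalg.lstsq(A_train, y[train], rcond=None)
A_test = np.column_stack([Z[test], np.ones(len(test))])
y_pred = A_test @ coef

rmse = np.sqrt(np.mean((y[test] - y_pred) ** 2))
print(f"components kept: {k}, test RMSE: {rmse:.4f}")
```

Because overall brightness dominates the variance in this synthetic data, very few components are needed to predict the intensity sum well; on real MNIST the number of components and the prediction quality will differ.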
The Main Function
The main.py script orchestrates the application of Singular Value Decomposition (SVD), Principal Component Analysis (PCA), and Pixel Intensity Sum prediction (PIS) on the MNIST dataset.
- SVD Operation (run_svd): Loads MNIST images, applies SVD to decompose and then reconstruct the images, and visualizes the original vs. reconstructed images.
- PCA Operation (run_pca): Loads MNIST images, computes PCA to extract eigenvalues and eigenvectors (principal components), prints the first 10 eigenvalues, and visualizes the "eigenfaces."
- Pixel Intensity Sum Prediction (run_pis): Preprocesses MNIST images by flattening, applies PCA for dimensionality reduction, uses linear regression to predict the total pixel intensity sums, and plots the actual vs. predicted sums.
- Argument Parsing: Enables the user to specify which operation to perform (svd, pca, or pis) via command-line arguments.
This structure allows for modular exploration of different machine learning techniques on image data, demonstrating dimensionality reduction, reconstruction, and regression analysis within a unified framework.
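One way such a dispatcher could look is sketched below. The function names run_svd, run_pca, and run_pis and the --run option come from the description above; their bodies here are placeholders, and details such as making --run required and restricting it with `choices` are assumptions rather than the actual contents of main.py.

```python
import argparse


# Placeholder operations (assumption): the real functions load MNIST and plot.
def run_svd():
    print("Running SVD reconstruction...")
    return "svd"


def run_pca():
    print("Running PCA...")
    return "pca"


def run_pis():
    print("Running pixel intensity sum prediction...")
    return "pis"


OPERATIONS = {"svd": run_svd, "pca": run_pca, "pis": run_pis}


def build_parser():
    parser = argparse.ArgumentParser(
        description="Linear algebra experiments on MNIST."
    )
    parser.add_argument(
        "--run",
        choices=sorted(OPERATIONS),   # rejects anything other than pca, pis, svd
        required=True,
        help="Which operation to perform.",
    )
    return parser


def main(argv=None):
    """Parse arguments (sys.argv when argv is None) and run the chosen operation."""
    args = build_parser().parse_args(argv)
    return OPERATIONS[args.run]()


# Equivalent to invoking the script with "--run svd" on the command line.
main(["--run", "svd"])
```

Keeping the operations in a dictionary means adding a new experiment only requires one new function and one new entry; the parser's accepted values update automatically.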
Run Locally
Clone the project
git clone https://github.com/SammyIJ/WASORIA.git
Go to the project directory
cd WASORIA/MNIST_Linear_Algebra
Run main.py, passing the operation to perform with the --run argument
python main.py --run svd
python main.py --run pca
python main.py --run pis
Download files
File details
Details for the file MNIST_Linear_Algebra-0.1.0.tar.gz.
File metadata
- Download URL: MNIST_Linear_Algebra-0.1.0.tar.gz
- Upload date:
- Size: 5.0 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/5.0.0 CPython/3.11.5
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | eeac82d988afd507297c8dc4485343388730bb3be956e0f392439deb33a42f21 |
| MD5 | 222bc0e1d423f99155aee0158823904b |
| BLAKE2b-256 | d2f39847c9655aa0a380da43ece1f8ee8af9a063b21570820a7deb0eea7a22c6 |
File details
Details for the file MNIST_Linear_Algebra-0.1.0-py3-none-any.whl.
File metadata
- Download URL: MNIST_Linear_Algebra-0.1.0-py3-none-any.whl
- Upload date:
- Size: 6.6 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/5.0.0 CPython/3.11.5
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | d4315722cdafaaab085a13b48c27e75188bb68879e09f009ac9607a56eeb2ebb |
| MD5 | fbfc005a40e60600a9f0f53d18af0699 |
| BLAKE2b-256 | 8abb3dc37ad43aecf6427e45ff146424da7f0d8348db4a58e437a41d7aa60419 |