pyspectral
SVD, PCA, and matrix decomposition — implemented from scratch in Python.
No np.linalg.svd. No sklearn.decomposition.PCA. Every algorithm is built from first principles using only NumPy and SciPy as a numerical substrate.
What it does
Starting from power iteration on a random vector, the library builds up a full stack:
power iteration
→ eigen decomposition (Hotelling deflation)
→ full SVD
→ truncated SVD (exact + randomized)
→ PCA
→ image compression
→ video compression
→ eigenfaces (face recognition)
It's a learning/research library. The goal is algorithmic clarity, not LAPACK performance.
For the full writeup on how it was built and why — read the blog.
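The first two stages of the stack can be sketched as follows. This is a minimal illustration, not the library's code: the names `power_iteration` and `top_eigenpairs`, their parameters, and the stopping rule are assumptions, and the sketch assumes a symmetric positive semi-definite input with well-separated leading eigenvalues.

```python
import numpy as np

def power_iteration(M, n_iter=1000, tol=1e-12, seed=0):
    """Dominant eigenpair of a symmetric PSD matrix M (illustrative sketch)."""
    rng = np.random.default_rng(seed)
    v = rng.standard_normal(M.shape[0])   # random starting vector
    v /= np.linalg.norm(v)
    lam = 0.0
    for _ in range(n_iter):
        w = M @ v
        norm = np.linalg.norm(w)
        if norm == 0.0:                   # v lies in the null space of M
            break
        v_new = w / norm
        lam_new = v_new @ M @ v_new       # Rayleigh quotient
        converged = abs(lam_new - lam) < tol * max(1.0, abs(lam_new))
        v, lam = v_new, lam_new
        if converged:
            break
    return lam, v

def top_eigenpairs(M, k):
    """Top-k eigenpairs via repeated power iteration and Hotelling deflation."""
    M = np.array(M, dtype=float)          # work on a copy
    vals, vecs = [], []
    for _ in range(k):
        lam, v = power_iteration(M)
        vals.append(lam)
        vecs.append(v)
        M -= lam * np.outer(v, v)         # remove the component just found
    return np.array(vals), np.column_stack(vecs)
```

Deflation subtracts each recovered component so the next run of power iteration converges to the following eigenvector. Errors accumulate from one component to the next, which is one reason this approach suits teaching better than production use.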
Install
pip install -e .
pip install -e ".[video]" # video compression (imageio)
pip install -e ".[notebook]" # Jupyter support
pip install -e ".[dev]" # pytest
Structure
pyspectral/
    linalg_core/     power_iteration.py, eigen_decomp.py, svd_full.py, svd_truncated.py
    pca/             covariance.py, pca_model.py
    applications/    image_compression.py, eigenfaces.py, video_compression.py
    benchmarking/    time_complexity.py, memory_profile.py
    experiments/     large_matrix_tests.py
    utils/           matrix_checks.py
    tests/           test_power_iteration.py, test_svd.py, test_pca.py
API
Full SVD
import numpy as np
from pyspectral.linalg_core.svd_full import compute_svd
A = np.random.randn(100, 80)
U, S, Vt = compute_svd(A)
A_rec = U[:, :len(S)] @ np.diag(S) @ Vt[:len(S), :]
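One way to see how an SVD can follow from an eigen decomposition is the sketch below. It is an assumption about the general technique, not a description of `compute_svd`: the function name `svd_via_gram` is hypothetical, and `np.linalg.eigh` stands in for the library's own eigen solver to keep the example short.

```python
import numpy as np

def svd_via_gram(A, rtol=1e-8):
    """Thin SVD of A from the eigen decomposition of A^T A (illustrative sketch)."""
    A = np.asarray(A, dtype=float)
    evals, V = np.linalg.eigh(A.T @ A)        # eigenvalues in ascending order
    order = np.argsort(evals)[::-1]           # reorder to descending
    evals, V = evals[order], V[:, order]
    S = np.sqrt(np.clip(evals, 0.0, None))    # singular values = sqrt(eigenvalues)
    keep = S > rtol * S[0] if S.size and S[0] > 0 else np.zeros_like(S, dtype=bool)
    S, V = S[keep], V[:, keep]                # drop numerically zero directions
    U = (A @ V) / S                           # u_i = A v_i / sigma_i
    return U, S, V.T
```

Forming `A.T @ A` squares the condition number, so small singular values lose accuracy. That is acceptable for a teaching sketch but is a known weakness of this route.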
Truncated SVD
from pyspectral.linalg_core.svd_truncated import truncated_svd
Uk, Sk, Vkt = truncated_svd(A, k=10) # exact
Uk, Sk, Vkt = truncated_svd(A, k=10, method="randomized") # randomized (Halko et al., 2011); typically faster when k << min(m, n)
A_approx = Uk @ np.diag(Sk) @ Vkt
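The randomized method can be sketched along the lines of the Halko et al. range finder. Everything below is an assumption for illustration: the name `randomized_svd_sketch`, the oversampling and power-iteration defaults, and the use of `np.linalg.svd` on the small projected matrix (the library presumably uses its own SVD there).

```python
import numpy as np

def randomized_svd_sketch(A, k, n_oversamples=10, n_power_iter=2, seed=0):
    """Rank-k SVD approximation via a random range finder (illustrative sketch)."""
    rng = np.random.default_rng(seed)
    A = np.asarray(A, dtype=float)
    m, n = A.shape
    ell = min(k + n_oversamples, m, n)        # sketch width with oversampling
    Omega = rng.standard_normal((n, ell))     # random test matrix
    Q, _ = np.linalg.qr(A @ Omega)            # orthonormal basis for range(A @ Omega)
    for _ in range(n_power_iter):             # power iterations sharpen the basis
        Q, _ = np.linalg.qr(A.T @ Q)
        Q, _ = np.linalg.qr(A @ Q)
    B = Q.T @ A                               # small (ell, n) projected matrix
    Ub, S, Vt = np.linalg.svd(B, full_matrices=False)  # stand-in for a from-scratch SVD
    U = Q @ Ub                                # lift left vectors back to m dimensions
    return U[:, :k], S[:k], Vt[:k, :]
```

The expensive factorization runs on an `(ell, n)` matrix instead of the full `(m, n)` one, which is where the speedup comes from when `k` is small relative to the matrix dimensions.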
PCA
from pyspectral.pca.pca_model import PCAEngine
X = np.random.randn(200, 50) # example data (assumed shape): 200 samples, 50 features
pca = PCAEngine(n_components=10)
pca.fit(X) # X: (n_samples, n_features)
Z = pca.transform(X) # -> (n, 10)
Xhat = pca.inverse_transform(Z) # -> (n, n_features)
print(pca.explained_variance_ratio_)
k, _ = pca.explained_variance(threshold=0.90)
print(f"{k} components explain 90% of the variance")
Use method="svd" when features >> samples (avoids building the p×p covariance matrix):
pca = PCAEngine(n_components=50, method="svd")
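For intuition, the covariance route and the variance-threshold lookup might look like the sketch below. The function names are hypothetical and `np.linalg.eigh` stands in for the library's own eigen solver; this is not the `PCAEngine` implementation.

```python
import numpy as np

def pca_sketch(X, n_components):
    """PCA via the p x p covariance matrix (illustrative sketch)."""
    X = np.asarray(X, dtype=float)
    mean = X.mean(axis=0)
    Xc = X - mean                              # center each feature
    cov = Xc.T @ Xc / (X.shape[0] - 1)         # sample covariance, shape (p, p)
    evals, evecs = np.linalg.eigh(cov)         # ascending eigenvalues
    order = np.argsort(evals)[::-1]
    evals = np.clip(evals[order], 0.0, None)   # guard against tiny negative values
    components = evecs[:, order][:, :n_components].T   # (n_components, p)
    ratio = evals / evals.sum()                # explained variance ratio, all components
    return mean, components, ratio

def components_for_threshold(ratio, threshold=0.90):
    """Smallest k whose cumulative explained variance reaches the threshold."""
    cumulative = np.cumsum(ratio)
    k = int(np.searchsorted(cumulative, threshold)) + 1
    return min(k, len(ratio))                  # clamp in case of rounding at the top end

# Projection and reconstruction follow directly from the components:
#   Z    = (X - mean) @ components.T
#   Xhat = Z @ components + mean
```

When features far outnumber samples, the `(p, p)` covariance matrix becomes the bottleneck, which is the case the `method="svd"` option above is meant to avoid.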
Image compression
from pyspectral.applications.image_compression import ImageCompressor
c = ImageCompressor("photo.png")
c.compress_and_compare(ranks=[5, 20, 50, 100])
c.print_report()
c.save_results("output/")
Output:
Rank k    Ratio    PSNR (dB)
5         55.2x    17.12
20        13.8x    22.28
50        5.5x     27.17
100       2.8x     31.98
SVD compression is not competitive with JPEG at equal file size (~30 dB gap), because the rank-k factors are stored as raw float32 with no entropy coding. It is useful for scientific/numerical matrices, not as a replacement for image codecs.
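The ratios in the table above scale as 1/k (55.2 / 13.8 = 4 = 20 / 5), which is what you get from counting stored values: a rank-k factorization of an m×n channel keeps k(m + n + 1) numbers instead of mn. The sketch below assumes that counting and the standard PSNR definition; the function names are hypothetical and the library's exact formulas may differ.

```python
import numpy as np

def rank_k_compression_ratio(m, n, k):
    """Stored values in the original (m*n) over stored values at rank k."""
    return (m * n) / (k * (m + n + 1))         # k columns of U, k rows of Vt, k singular values

def psnr(original, approx, max_value=255.0):
    """Peak signal-to-noise ratio in dB; max_value=255 assumes 8-bit pixel range."""
    original = np.asarray(original, dtype=float)
    approx = np.asarray(approx, dtype=float)
    mse = np.mean((original - approx) ** 2)
    if mse == 0.0:
        return float("inf")                    # identical images
    return 10.0 * np.log10(max_value ** 2 / mse)
```

Because the count grows linearly in k, the ratio drops below 1 once k exceeds mn / (m + n + 1), at which point the factors take more space than the image itself.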
Video compression
from pyspectral.applications.video_compression import VideoCompressor
vc = VideoCompressor()
vc.load_synthetic(kind="wave") # "wave", "bouncing", "noise", "mixed"
vc.compress_frame_by_frame(k=5) # rank-k SVD per frame
vc.compress_temporal(k=5) # global SVD across all frames (T, H*W)
vc.save_results("output/video/")
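The difference between the two modes comes down to which matrix gets factorized. The sketch below is an illustration under assumptions: grayscale frames stored as a `(T, H, W)` array, hypothetical function names, and `np.linalg.svd` as a stand-in for the library's own truncated SVD.

```python
import numpy as np

def compress_frame_by_frame_sketch(frames, k):
    """Independent rank-k approximation of each (H, W) frame."""
    frames = np.asarray(frames, dtype=float)
    out = np.empty_like(frames)
    for t, frame in enumerate(frames):
        U, S, Vt = np.linalg.svd(frame, full_matrices=False)
        out[t] = (U[:, :k] * S[:k]) @ Vt[:k]   # rank-k reconstruction of this frame
    return out

def compress_temporal_sketch(frames, k):
    """One rank-k approximation of the whole clip, flattened to (T, H*W)."""
    frames = np.asarray(frames, dtype=float)
    T, H, W = frames.shape
    M = frames.reshape(T, H * W)               # one row per frame
    U, S, Vt = np.linalg.svd(M, full_matrices=False)
    M_k = (U[:, :k] * S[:k]) @ Vt[:k]          # k shared spatial patterns, mixed per frame
    return M_k.reshape(T, H, W)
```

The temporal variant exploits redundancy between frames: a clip whose frames are scaled copies of one pattern is captured exactly at k=1, while the per-frame variant has to spend its rank on spatial structure in every frame separately.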
Eigenfaces
from pyspectral.applications.eigenfaces import EigenfacesModel
model = EigenfacesModel(n_components=30, image_size=(32, 32))
model.train_synthetic(n_subjects=10, n_images_per_subject=8)
label, dist, _ = model.recognize_face(query_image)
accuracy, _ = model.evaluate_accuracy() # leave-one-out CV
model.visualize_eigenfaces(n_show=16, output_path="eigenfaces.png")
Real face dataset:
model = EigenfacesModel(n_components=50, image_size=(112, 92))
model.train_eigenfaces("path/to/dataset/") # one subfolder per person
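The classic eigenfaces recipe is PCA on flattened images followed by nearest-neighbour matching in the reduced space. The sketch below shows that recipe under assumptions: the function names are hypothetical, `np.linalg.eigh` stands in for the library's eigen solver, and `n_components` must be smaller than the number of training images. It is not the `EigenfacesModel` implementation.

```python
import numpy as np

def fit_eigenfaces(images, n_components):
    """Eigenfaces from a stack of images with shape (n_images, H, W)."""
    X = np.asarray(images, dtype=float).reshape(len(images), -1)
    mean = X.mean(axis=0)
    Xc = X - mean
    # Gram trick: eigen-decompose the small (n, n) matrix instead of (H*W, H*W)
    evals, evecs = np.linalg.eigh(Xc @ Xc.T)
    order = np.argsort(evals)[::-1][:n_components]
    faces = Xc.T @ evecs[:, order]             # map back to pixel space, shape (H*W, k)
    faces /= np.linalg.norm(faces, axis=0)     # unit-norm eigenfaces
    return mean, faces, Xc @ faces             # training projections, shape (n, k)

def recognize(query, mean, faces, train_proj, labels):
    """Label of the nearest training image in eigenface space, with its distance."""
    z = (np.asarray(query, dtype=float).reshape(-1) - mean) @ faces
    dists = np.linalg.norm(train_proj - z, axis=1)
    i = int(np.argmin(dists))
    return labels[i], float(dists[i])
```

The Gram trick matters at realistic sizes: for 112×92 images the pixel-space covariance would be 10304×10304, while the Gram matrix is only as large as the number of training images.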
Matrix utilities
from pyspectral.utils.matrix_checks import matrix_info, condition_number, is_symmetric
matrix_info(A) # shape, norms, rank, condition number
condition_number(A) # sigma_max / sigma_min
is_symmetric(A)
Running demos and tests
python -m pyspectral.applications.image_compression
python -m pyspectral.applications.eigenfaces
python -m pyspectral.applications.video_compression
python -m pyspectral.benchmarking.time_complexity
python -m pyspectral.experiments.large_matrix_tests
python -m pytest pyspectral/tests/ -p no:asyncio -q
MIT License