Matrix completion with column subset selection.
CSMC
CSMC is a Python library for performing column subset selection in matrix completion tasks. It provides an implementation of the CSMC method, which completes the missing entries of a matrix using a subset of its columns.
Columns Selected Matrix Completion (CSMC) is a two-stage approach for low-rank matrix recovery. In the first stage, CSMC samples columns of the input matrix and recovers a smaller column submatrix. In the second stage, it solves a least squares problem to reconstruct the whole matrix.
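The two stages described above can be sketched in plain NumPy. This is an illustrative sketch under stated assumptions, not the library's implementation: `extend_from_columns` is a hypothetical helper, and the stage-one solver is replaced by the ground-truth columns purely so that the least-squares stage can be checked in isolation.

```python
import numpy as np

def extend_from_columns(C_filled, M_incomplete):
    """Stage two (sketch): regress each column of M on the completed column submatrix."""
    n_rows, n_cols = M_incomplete.shape
    M_hat = np.empty((n_rows, n_cols))
    for j in range(n_cols):
        observed = ~np.isnan(M_incomplete[:, j])
        # Least squares restricted to the rows where column j is observed.
        # rcond=1e-10 treats tiny singular values as zero, since C_filled is low-rank.
        z, *_ = np.linalg.lstsq(
            C_filled[observed], M_incomplete[observed, j], rcond=1e-10
        )
        M_hat[:, j] = C_filled @ z
    return M_hat

rng = np.random.default_rng(0)
n_rows, n_cols, rank = 50, 250, 3
M = rng.normal(size=(n_rows, rank)) @ rng.normal(size=(rank, n_cols))
missing = rng.random(M.shape) < 0.7          # True where an entry is hidden
M_incomplete = np.where(missing, np.nan, M)

# Stage one (stand-in): sample 100 columns uniformly at random. A real run would
# complete M_incomplete[:, col_idx] with a matrix completion solver; here the
# true columns are used instead (an assumption made for this demo only).
col_idx = rng.choice(n_cols, size=100, replace=False)
C_filled = M[:, col_idx]

M_hat = extend_from_columns(C_filled, M_incomplete)
rel_err = np.linalg.norm(M_hat - M) / np.linalg.norm(M)
print(f"relative error of the least-squares stage: {rel_err:.2e}")
```

With an exactly recovered submatrix, the second stage reduces to one small least-squares problem per column, which is why the expensive completion step only has to run on the sampled columns.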
CSMC supports NumPy arrays and PyTorch tensors.
Installation
You can install CSMC using pip:
pip install csmc
Usage
- Generate random data
import numpy as np
import random
n_rows = 50
n_cols = 250
rank = 3
x = np.random.default_rng().normal(size=(n_rows, rank))
y = np.random.default_rng().normal(size=(rank, n_cols))
M = np.dot(x, y)
M_incomplete = np.copy(M)
num_missing_elements = int(0.7 * M.size)
indices_to_zero = random.sample(range(M.size), k=num_missing_elements)
rows, cols = np.unravel_index(indices_to_zero, M.shape)
M_incomplete[rows, cols] = np.nan
- Fill with CSNN algorithm
from csmc import CSMC
solver = CSMC(M_incomplete, col_number=100)
M_filled = solver.fit_transform(M_incomplete)
- Fill with Nuclear Norm Minimization with SDP (NN algorithm)
from csmc import NuclearNormMin
solver = NuclearNormMin(M_incomplete)
M_filled = solver.fit_transform(M_incomplete, np.isnan(M_incomplete))
- Fill with Frank-Wolfe (Conditional Gradient Method)
from csmc import CGM
solver = CGM(M_incomplete)
M_filled = solver.fit_transform(M_incomplete, np.isnan(M_incomplete))
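Since the usage example starts from a known low-rank matrix `M`, the output of any of the solvers above can be compared against it on the hidden entries. The sketch below shows one way to do that; `relative_error` is a hypothetical helper that is not part of CSMC, and `M_filled` is a noisy stand-in for a solver result so that the snippet runs without the library installed.

```python
import numpy as np

def relative_error(M_true, M_filled, missing_mask):
    """Relative Frobenius error measured only on the entries that were hidden."""
    diff = (M_filled - M_true)[missing_mask]
    return np.linalg.norm(diff) / np.linalg.norm(M_true[missing_mask])

rng = np.random.default_rng(0)
M = rng.normal(size=(50, 3)) @ rng.normal(size=(3, 250))
missing_mask = rng.random(M.shape) < 0.7     # True where an entry is hidden

# Stand-in for a solver output (assumption): ground truth plus small noise
# on the hidden entries. Replace with the M_filled returned by a real solver.
M_filled = M + 0.01 * rng.normal(size=M.shape) * missing_mask

err = relative_error(M, M_filled, missing_mask)
print(f"relative error on missing entries: {err:.4f}")
```

Restricting the error to the hidden entries avoids rewarding a solver for simply copying the observed values.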
Algorithms
- NuclearNormMin: Matrix completion by SDP (NN algorithm). Exact Matrix Completion via Convex Optimization
- CSNN: Matrix completion by CSNN
- PGD: Nuclear norm minimization using Proximal Gradient Descent (PGD). Spectral Regularization Algorithms for Learning Large Incomplete Matrices by Mazumder et al.
- CSPGD: Matrix completion by CSPGD
- CGM: Matrix completion with Frank-Wolfe method
- SGD: Scaled Gradient Descent (SGD). [Accelerating Ill-Conditioned Low-Rank Matrix Estimation via Scaled Gradient Descent](https://jmlr.org/papers/volume22/20-1067/20-1067.pdf) by Tong et al.
- SVP: Singular Value Projection (SVP). Guaranteed Rank Minimization via Singular Value Projection by Meka et al.
- MC2: MC2. MC2: a two-phase algorithm for leveraged matrix completion by Eftekhari et al.
- CUR+Nuc: CUR+Nuc based on CUR+ for uniformly sampled entries. CUR Algorithm for Partially Observed Matrices by Xu et al.
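To give a feel for what a proximal gradient method for nuclear norm minimization does, here is a minimal NumPy sketch in the spirit of the Soft-Impute iteration from Mazumder et al. It is not the library's `PGD` class: the function name `soft_impute` and the parameters `lam` and `n_iters` are assumptions chosen for illustration.

```python
import numpy as np

def soft_impute(M_incomplete, lam=0.1, n_iters=200):
    """Sketch of proximal gradient descent on
    0.5 * ||P_obs(X - M)||_F^2 + lam * ||X||_*  with unit step size."""
    observed = ~np.isnan(M_incomplete)
    M_obs = np.where(observed, M_incomplete, 0.0)
    X = np.zeros(M_incomplete.shape)
    for _ in range(n_iters):
        # Gradient step: keep observed entries, keep the current guess elsewhere.
        Z = np.where(observed, M_obs, X)
        # Proximal step: soft-threshold the singular values by lam.
        U, s, Vt = np.linalg.svd(Z, full_matrices=False)
        X = (U * np.maximum(s - lam, 0.0)) @ Vt
    return X

rng = np.random.default_rng(0)
M = rng.normal(size=(40, 2)) @ rng.normal(size=(2, 40))
missing = rng.random(M.shape) < 0.4          # True where an entry is hidden
M_incomplete = np.where(missing, np.nan, M)

X_hat = soft_impute(M_incomplete)
err = np.linalg.norm(X_hat - M) / np.linalg.norm(M)
# Baseline for comparison: fill the hidden entries with zeros.
baseline = np.linalg.norm(np.where(missing, 0.0, M) - M) / np.linalg.norm(M)
print(f"soft-impute error: {err:.3f}, zero-fill error: {baseline:.3f}")
```

Each iteration needs a full SVD of the working matrix, which is the cost that column-selected variants such as CSPGD aim to reduce by running the solver on a column submatrix only.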
Examples
Configuration
To adjust the number of threads used for intra-op parallelism on CPU, modify the following variable in settings.py:
NUM_THREADS = 8
Citation
Krajewska, A., Niewiadomska-Szynkiewicz, E. (2024). Randomized Approach to Matrix Completion: Applications in Recommendation Systems and Image Inpainting.
Krajewska, A., Niewiadomska-Szynkiewicz, E. (2024). Efficient Data Completion and Augmentation. 2024 IEEE 11th International Conference on Data Science and Advanced Analytics (DSAA). IEEE.