TensorClus is a Python package for clustering of three-way tensor data
TensorClus
TensorClus (Tensor Clustering) is the first Python library dedicated to clustering and co-clustering of tensor data. It makes it easy to perform tensor clustering through decomposition or tensor learning and tensor algebra. TensorClus interoperates with other Python packages such as NumPy, Tensorly, TensorFlow, and TensorD, and can run methods at scale on CPU or GPU.
It supports the major operating systems, namely Microsoft Windows, macOS, and Ubuntu.
- Source-code: https://github.com/boutalbi/TensorClus
- Jupyter Notebooks: https://github.com/boutalbi/TensorClus/blob/master/demo_tensorClus.ipynb
Brief description
The TensorClus library provides several features:
- Several datasets
- Tensor co-clustering with various data types
- Tensor decomposition and clustering
- Visualization
Requirements
numpy==1.18.3
pandas==1.0.3
scipy==1.4.1
matplotlib==3.0.3
scikit-learn==0.22.2.post1
coclust==0.2.1
tensorD==0.1
tensorflow==2.3.0
tensorflow-gpu==2.3.0
tensorflow-estimator==2.3.0
tensorly==0.4.5
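The pins above are exact, so an existing environment can easily drift from them. The following is a minimal sketch, not part of TensorClus, that compares the pinned versions with what is installed. The helper names `parse_pins` and `check_pins` are hypothetical; the only library call used is the standard `importlib.metadata.version` (Python 3.8+).

```python
from importlib import metadata

# Pins copied from the Requirements list above.
PINNED = """
numpy==1.18.3
pandas==1.0.3
scipy==1.4.1
matplotlib==3.0.3
scikit-learn==0.22.2.post1
coclust==0.2.1
tensorD==0.1
tensorflow==2.3.0
tensorflow-gpu==2.3.0
tensorflow-estimator==2.3.0
tensorly==0.4.5
"""


def parse_pins(text):
    """Turn 'name==version' lines into a {name: version} dict."""
    pins = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        name, _, version = line.partition("==")
        pins[name.strip()] = version.strip()
    return pins


def check_pins(pins):
    """Return {name: (wanted, installed)} for each mismatch.

    `installed` is None when the distribution is not installed at all.
    """
    mismatches = {}
    for name, wanted in pins.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None
        if installed != wanted:
            mismatches[name] = (wanted, installed)
    return mismatches


# Report every package whose installed version differs from its pin.
for name, (wanted, installed) in check_pins(parse_pins(PINNED)).items():
    print(f"{name}: wanted {wanted}, found {installed or 'not installed'}")
```

An empty report means the environment matches the pins exactly. Running the check inside a fresh virtual environment avoids clashes with newer versions of these packages installed elsewhere.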
Installing TensorClus
To install the TensorClus package, run the following command:
pip install -U TensorClus
To clone the TensorClus project from GitHub and install it from source:
# Install git LFS via https://www.atlassian.com/git/tutorials/git-lfs
# initialize Git LFS
git lfs install
# output: Git LFS initialized.
git init
# output: Initialized
# clone the repository
git clone https://github.com/boutalbi/TensorClus.git
cd TensorClus
# Install in editable mode with `-e` or, equivalently, `--editable`
pip install -e .
For more details about TensorClus, see Documentation.
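After either installation route, a quick check can confirm that the package is visible to the interpreter. This is a small sketch using only the standard library; `is_importable` is a hypothetical helper, and the top-level module names are taken from the imports used in the example in this README (`TensorClus`, `numpy`) plus `tensorly` from the requirements.

```python
import importlib.util


def is_importable(module_name):
    """Return True if a top-level module can be located, without importing it."""
    return importlib.util.find_spec(module_name) is not None


# Check the package itself and two of its dependencies.
for name in ("TensorClus", "numpy", "tensorly"):
    status = "found" if is_importable(name) else "missing"
    print(f"{name}: {status}")
```

`find_spec` only locates the module, so this check stays fast and does not load TensorFlow. It does not prove that the import succeeds at runtime, only that the package is on the path.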
License
TensorClus is released under the MIT License (refer to the LICENSE file for details).
Examples
import TensorClus.coclustering.sparseTensorCoclustering as tcSCoP
from TensorClus.reader import load
import numpy as np
from coclust.evaluation.external import accuracy
##################################################################
# Load DBLP1 dataset #
##################################################################
data_v2, labels, slices = load.load_dataset("DBLP1_dataset")
n = data_v2.shape[0]
##################################################################
# Execute TSPLBM on the dataset #
##################################################################
# Define the number of clusters K
K = 3
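# Optional: initialization of row and column partitions.
# Assign each of the n objects to a random cluster, then encode the
# assignment as an n x K matrix (1 for the assigned cluster, 1e-9 elsewhere).
z_a = np.random.randint(K, size=n)
z = np.zeros((n, K)) + 1.e-9
z[np.arange(n), z_a] = 1
# Reuse the same matrix to initialize the column partition
w = np.asarray(z)
# Run TSPLBM
model = tcSCoP.SparseTensorCoclusteringPoisson(n_clusters=K, fuzzy=True, init_row=z, init_col=w, max_iter=50)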
model.fit(data_v2)
predicted_row_labels = model.row_labels_
predicted_column_labels = model.column_labels_
acc = np.around(accuracy(labels, predicted_row_labels),3)
print("Accuracy : ", acc)
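Cluster ids returned by a model are arbitrary, so predicted labels cannot be compared with the true labels position by position. The sketch below illustrates the usual definition of clustering accuracy, which scores the best one-to-one relabeling of the predicted clusters. It is an illustration only: it assumes, without confirmation from this README, that `accuracy` from coclust follows this definition, and `clustering_accuracy` is a hypothetical name. It tries every permutation, which is fine for small K such as K = 3.

```python
from collections import Counter
from itertools import permutations


def clustering_accuracy(true_labels, predicted_labels):
    """Fraction of objects matched under the best one-to-one relabeling."""
    if len(true_labels) != len(predicted_labels):
        raise ValueError("label sequences must have the same length")
    if len(true_labels) == 0:
        raise ValueError("label sequences must not be empty")

    # counts[(p, t)] = number of objects predicted as p whose true class is t
    counts = Counter(zip(predicted_labels, true_labels))

    true_ids = sorted(set(true_labels))
    pred_ids = sorted(set(predicted_labels))
    # Pad the shorter side with None so every cluster gets a partner.
    size = max(len(true_ids), len(pred_ids))
    true_ids += [None] * (size - len(true_ids))
    pred_ids += [None] * (size - len(pred_ids))

    best = 0
    for perm in permutations(true_ids):
        matched = sum(counts[(p, t)] for p, t in zip(pred_ids, perm))
        best = max(best, matched)
    return best / len(true_labels)


# Same partition with different cluster ids: a perfect score.
print(clustering_accuracy([0, 0, 1, 1, 2, 2], [2, 2, 0, 0, 1, 1]))
```

The number of permutations grows factorially with K, so for many clusters an assignment solver such as `scipy.optimize.linear_sum_assignment` applied to the contingency table is the practical choice.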
Datasets
The datasets and their descriptions are available on Google Drive.
Citing
If you use TensorClus in an academic paper, please cite:
@article{boutalbi2020tensor,
title={Tensor latent block model for co-clustering},
author={Boutalbi, Rafika and Labiod, Lazhar and Nadif, Mohamed},
journal={International Journal of Data Science and Analytics},
pages={1--15},
year={2020},
publisher={Springer},
doi= {10.1007/s41060-020-00205-5},
url= "https://link.springer.com/article/10.1007/s41060-020-00205-5"
}
References
[1] Boutalbi, R., Labiod, L., & Nadif, M. (2020). Tensor latent block model for co-clustering. International Journal of Data Science and Analytics, 1-15.
[2] Boutalbi, R., Labiod, L., & Nadif, M. (2019, July). Sparse Tensor Co-clustering as a Tool for Document Categorization. In Proceedings of the 42nd International ACM SIGIR Conference on Research and Development in Information Retrieval (pp. 1157-1160).
[3] Boutalbi, R., Labiod, L., & Nadif, M. (2019, April). Co-clustering from Tensor Data. In Pacific-Asia Conference on Knowledge Discovery and Data Mining (pp. 370-383). Springer.