Package for creating rank-based interpretable and contextual embeddings.

interpretable-embeddings

interpretable-embeddings is the official implementation of the RaDE, RaDE+, and GRaCE algorithms, which generate rank-based, interpretable, and contextual embeddings from top-K similarity lists. The algorithms use graph-based measures to produce effective, unsupervised embeddings for retrieval, clustering, classification, and visualization.

🔍 Overview

Unlike traditional embedding techniques that require raw features or supervised training, this package builds representations entirely from ranked similarity lists (e.g., from a kNN graph or retrieval system). Each embedding dimension corresponds to a similarity measure computed against one of the "leaders" (reference nodes).

Key benefits:

Unsupervised: No labels or ground truth needed.
Interpretable: Embedding dimensions are human-understandable.
Versatile: Works for text, images, graphs—any domain with top-K similarities.
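
To make the interpretability claim concrete, here is a minimal sketch of how an embedding row could be read back in terms of leaders. It assumes you have the embedding matrix plus a list of leader item IDs; `leader_ids` and `explain_item` are hypothetical names used for illustration, and the package may expose this information differently or not at all.

```python
import numpy as np

def explain_item(embeddings, leader_ids, item_idx, top_n=3):
    """Return the top_n (leader ID, value) pairs for one embedding row."""
    row = embeddings[item_idx]
    # Dimensions sorted from strongest to weakest affinity.
    order = np.argsort(row)[::-1][:top_n]
    return [(leader_ids[d], float(row[d])) for d in order]

# Toy data: 2 items, 3 dimensions; dimension d refers to leader_ids[d].
toy_embeddings = np.array([[0.1, 0.7, 0.2],
                           [0.5, 0.3, 0.2]])
toy_leader_ids = [12, 40, 7]  # hypothetical leader item IDs

explanation = explain_item(toy_embeddings, toy_leader_ids, item_idx=0, top_n=2)
print(explanation)
```

Each pair names a reference item and how strongly the queried item relates to it, which is what makes a dimension readable without further decoding.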

📦 Installation

pip install interpretable-embeddings

Dependencies:

numpy
tqdm

Requires Python ≥ 3.7.

🧠 Algorithms

RaDE (Rank-based Diffusion Embedding)

Selects leaders by propagating rank-based affinities through a diffusion process. Each embedding dimension encodes the affinity to a single representative node.
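
The diffusion idea can be sketched with NumPy as follows. This is an illustration only, not the package's internal code: the linear rank weighting used to build `W` is an assumption, and only the power step `A = W^t` mirrors the parameter description given in the Usage section.

```python
import numpy as np

def rank_affinity_matrix(ranked_lists, n_items):
    """Build a row-stochastic matrix W from ranked lists.

    W[i, j] is larger when item j appears earlier in item i's list.
    The linear weight (L - position) is an illustrative assumption.
    """
    W = np.zeros((n_items, n_items))
    for i, ranked in enumerate(ranked_lists):
        L = len(ranked)
        for pos, j in enumerate(ranked):
            W[i, j] = L - pos
    return W / W.sum(axis=1, keepdims=True)

def diffuse(W, t):
    """Propagate affinities for t steps: A = W^t."""
    return np.linalg.matrix_power(W, t)

# Toy input: 4 items, each with a ranked list of 3 item IDs.
ranked_lists = [[0, 1, 2], [1, 0, 2], [2, 3, 1], [3, 2, 0]]
W = rank_affinity_matrix(ranked_lists, n_items=4)
A = diffuse(W, t=2)
```

In this toy case item 3 is absent from item 0's list, so `W[0, 3]` is zero, yet `A[0, 3]` becomes positive after two steps because item 0 reaches item 3 through item 2.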

RaDE+ (Multi-Representative RaDE)

Extends RaDE by expanding each representative node into an expansion set of similar nodes (Algorithm 1 from the paper). Each embedding dimension aggregates affinities over the expansion set with linearly decreasing weights, producing more robust and diversified representations.
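
A minimal sketch of the aggregation step, assuming an affinity matrix `A` and an expansion set ordered from the leader outward. The exact weight formula from the paper is not reproduced here; the normalized weights `size, size-1, ..., 1` are one plausible reading of "linearly decreasing", and the function names are hypothetical.

```python
import numpy as np

def linear_weights(size):
    """Linearly decreasing weights that sum to 1 (illustrative choice)."""
    raw = np.arange(size, 0, -1, dtype=float)  # size, size-1, ..., 1
    return raw / raw.sum()

def aggregate_dimension(A, expansion_set):
    """One embedding dimension: weighted sum of affinities to the set."""
    weights = linear_weights(len(expansion_set))
    return A[:, expansion_set] @ weights

# Toy affinity matrix for 3 nodes.
A = np.array([[0.6, 0.3, 0.1],
              [0.2, 0.7, 0.1],
              [0.1, 0.2, 0.7]])

# Leader 0 expanded with its similar node 1.
column = aggregate_dimension(A, expansion_set=[0, 1])
```

The leader keeps the largest weight, so the dimension still reads as "affinity to this leader", while the extra nodes smooth out noise in any single affinity value.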

GRaCE (Graph and Rank-based Contextual Embeddings)

Extends RaDE with unsupervised effectiveness estimation (e.g., Reciprocal Density, Accumulated JacMax) and rank correlation measures (e.g., Reciprocal Distance, JacMax).
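
To give a feel for what a rank correlation measure compares, the sketch below computes a plain Jaccard overlap between the top-k prefixes of two ranked lists. This is a simplified stand-in chosen for illustration; it is not the JacMax or reciprocal measure implemented by the package, and `jaccard_at_k` is a hypothetical helper.

```python
def jaccard_at_k(list_a, list_b, k):
    """Jaccard overlap between the top-k items of two ranked lists."""
    top_a, top_b = set(list_a[:k]), set(list_b[:k])
    union = top_a | top_b
    return len(top_a & top_b) / len(union) if union else 0.0

# Two ranked lists that share items 2 and 3 within their top 3.
score = jaccard_at_k([1, 2, 3, 4], [2, 3, 5, 6], k=3)
print(score)
```

Two items whose neighborhoods overlap heavily score close to 1, which is the contextual signal that rank-based methods exploit.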

🛠 Usage

Input Format

Your input must be a .txt file with one ranked list per line (space-separated item IDs):

15 3 8 22 7 9 ...
3 2 11 5 6 ...
...

Each line holds the ranked list for one query, and each number is the ID of a retrieved item.
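
The following sketch writes and reads a file in this format using only the standard library. The helper names and the toy IDs are illustrative assumptions; the package ships its own reader in utils.py, whose interface is not shown here.

```python
import os
import tempfile

def write_ranked_lists(path, ranked_lists):
    """Write one space-separated ranked list per line."""
    with open(path, "w", encoding="utf-8") as fh:
        for ranked in ranked_lists:
            fh.write(" ".join(str(item) for item in ranked) + "\n")

def read_ranked_lists(path, top_k=None):
    """Read ranked lists back, optionally keeping only the first top_k IDs."""
    lists = []
    with open(path, encoding="utf-8") as fh:
        for line in fh:
            items = [int(token) for token in line.split()]
            if items:  # skip blank lines
                lists.append(items[:top_k])
    return lists

# Toy data: 4 queries, each with a ranked list of 4 item IDs.
toy_lists = [[0, 2, 1, 3], [1, 3, 0, 2], [2, 0, 3, 1], [3, 1, 2, 0]]
toy_path = os.path.join(tempfile.mkdtemp(), "ranked_lists.txt")

write_ranked_lists(toy_path, toy_lists)
loaded = read_ranked_lists(toy_path, top_k=3)
```

A file produced this way can then be passed as `rks_path` in the examples that follow.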

RaDE Example

from interpretable_embeddings import RaDE

# Initialize
rade = RaDE(rks_path="data/ranked_lists.txt", rks_size_L=20)

# Compute internal structure
rade.fit(num_candidates=1000, num_leaders=128, t=2)

# Get embedding vectors
embeddings = rade.transform()

# Or do both in one call
embeddings = rade.fit_transform(num_candidates=1000, num_leaders=128, t=2)

RaDE+ Example

from interpretable_embeddings import RaDEPlus

rade_plus = RaDEPlus(rks_path="data/ranked_lists.txt", rks_size_L=20)

# Compute internal structure
rade_plus.fit(num_candidates=1000, num_leaders=128, t=2, m=3)

# Get embedding vectors
embeddings = rade_plus.transform()

# Or do both in one call
embeddings = rade_plus.fit_transform(num_candidates=1000, num_leaders=128, t=2, m=3)

Parameters:

num_candidates: size of the candidate pool (top-k nodes by reciprocal affinity).
num_leaders: embedding dimensionality (number of representative nodes).
t: diffusion steps for the transition matrix A = W^t.
m: number of nodes added to each leader's expansion set. Constraint: m * num_leaders ≤ num_candidates.
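
The constraint on m can be checked before calling fit. The helpers below are a hypothetical convenience, not part of the package; they only encode the inequality stated above.

```python
def max_expansion_size(num_candidates, num_leaders):
    """Largest m that satisfies m * num_leaders <= num_candidates."""
    return num_candidates // num_leaders

def check_expansion_constraint(num_candidates, num_leaders, m):
    """Raise ValueError if m * num_leaders exceeds num_candidates."""
    if m * num_leaders > num_candidates:
        raise ValueError(
            f"m * num_leaders = {m * num_leaders} exceeds "
            f"num_candidates = {num_candidates}"
        )

# Values from the RaDE+ example: 3 * 128 = 384 <= 1000, so this passes.
check_expansion_constraint(num_candidates=1000, num_leaders=128, m=3)
largest_m = max_expansion_size(num_candidates=1000, num_leaders=128)
```

With 1000 candidates and 128 leaders, any m above 7 violates the constraint, since 8 * 128 = 1024.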

GRaCE Example

from interpretable_embeddings import GRaCE

grace = GRaCE(
    rks_path="data/ranked_lists.txt",
    top_K=20,
    correlation_measure="jacmax",  # or "reciprocal"
    estimation_measure="reciprocal_density",  # or "accjacmax"
    alpha=0.95
)

# Compute internal structure
grace.fit(num_leaders=128)

# Get embedding vectors
embeddings = grace.transform()

# Or do both in one call
embeddings = grace.fit_transform(num_leaders=128)

🔬 Example Applications

Retrieval

from sklearn.metrics.pairwise import cosine_similarity

query_idx = 0
sims = cosine_similarity(embeddings)
# Sort by descending similarity; the first hit is normally the query itself, so skip it.
top_k = sims[query_idx].argsort()[::-1][1:11]
print("Top-10 results for query:", top_k)

Clustering

from sklearn.cluster import KMeans

kmeans = KMeans(n_clusters=5).fit(embeddings)
print(kmeans.labels_)

Classification

from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# `labels` is not produced by this package: supply one class label per embedding row.
X_train, X_test, y_train, y_test = train_test_split(embeddings, labels)
clf = LogisticRegression(max_iter=1000).fit(X_train, y_train)
print("Accuracy:", clf.score(X_test, y_test))

Visualization

from sklearn.manifold import TSNE
import matplotlib.pyplot as plt

proj = TSNE(n_components=2).fit_transform(embeddings)
# `labels` (one integer class per item) is user-supplied and only used for coloring.
plt.scatter(proj[:, 0], proj[:, 1], c=labels, cmap="tab10")
plt.title("2D Visualization of RaDE Embeddings")
plt.show()

📁 Package Structure

interpretable-embeddings/
│
├── rade.py                 # RaDE implementation
├── rade_plus.py            # RaDE+ implementation (multi-representative)
├── grace.py                # GRaCE implementation
├── utils.py                # Ranked list reader
└── measures/
    ├── qpp.py              # Query performance prediction measures (AccJacMax, Reciprocal Density)
    └── correlation.py      # Rank correlation measures (JacMax, Reciprocal KNN)

📚 Citation

If you use this package in your research, please cite:

GRaCE

Almeida, T. C. C., Letício, G. R., Valem, L. P., Freitas, A., Pedronette, D. C. G. Effective Graph and Rank-based Contextual Embeddings for Textual and Multimedia Data. 2025 International Joint Conference on Neural Networks (IJCNN), Rome, Italy.

RaDE / RaDE+

De Fernando, F. A., Pedronette, D. C. G., De Sousa, G. J., Valem, L. P., Guilherme, I. R. RaDE+: A semantic rank-based graph embedding algorithm. International Journal of Information Management Data Insights, 2(2), 100078, 2022.

De Fernando, F. A., Pedronette, D. C. G., De Sousa, G. J., Valem, L. P., Guilherme, I. R. RaDE: A Rank-based Graph Embedding Approach. 15th International Conference on Computer Vision Theory and Applications (VISAPP), 2020.

🤝 Contact

Thiago César Castilho Almeida: tc.almeida@unesp.br
Lucas Pascotti Valem: lucaspascottivalem@gmail.com
Daniel Carlos Guimarães Pedronette: pedronette@gmail.com

Release history

1.1.1 (this version): Apr 13, 2026
1.1.0: Apr 6, 2026
1.0.1: Jan 22, 2026
0.0.1: Jan 22, 2026
