Package for Joint Embedding-classifier Learning for Interpretability. Learns feature/item/user embeddings with specific structures, recommends new item-user associations and provides feature importance scores.

Joint Embedding-classifier Learning for improved Interpretability (JELI) Python Package

This repository is part of the EU-funded RECeSS project (#101102016) and hosts the code for JELI, an open-source Python package for collaborative filtering.

Statement of need

Interpretability is a topical question in recommender systems, especially in healthcare applications. In drug repurposing, the goal is to identify novel therapeutic indications as drug-disease pairs. An interpretable drug repurposing algorithm quantifies the importance of each input feature for the predicted therapeutic drug-disease association in a non-ambiguous fashion, using post hoc methods. Unfortunately, different importance score-based approaches lead to different results, yielding unreliable interpretations.

We introduce the novel Joint Embedding-classifier Learning for improved Interpretability (JELI). It features a new structured recommender system and trains it jointly on a drug-disease-gene knowledge graph completion task. In particular, JELI simultaneously (a) learns the gene, drug, and disease embeddings; (b) predicts new drug-disease associations based on those embeddings; and (c) provides importance scores for each gene. The drug and disease embeddings have a structure that depends on the gene embeddings. Therefore, JELI allows the introduction of graph-based priors on the connections between diseases, drugs, and genes in a generic fashion to recommend and argue for novel therapeutic drug-disease associations.

Contrary to prior works, the recommender system explicitly includes the importance scores, strengthening the link between the recommendations and the extracted scores while allowing the use of a generic embedding model. The recommendation strategy in JELI can also be readily applied beyond the task of drug repurposing for any sets of items, users, and features.
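
To make the idea of structured embeddings more concrete, here is a toy NumPy sketch. It is not JELI's actual model: it assumes the simplest possible structure, in which a drug or disease embedding is the feature-weighted sum of the gene embeddings, and it scores a drug-disease pair with a sigmoid of the dot product of the two embeddings. All names (`embed`, `score_pair`, the random matrices) are hypothetical and chosen for illustration only.

```python
import numpy as np

rng = np.random.default_rng(1234)
n_features, n_dimensions = 6, 3

# Hypothetical gene (feature) embeddings: one row per gene.
feature_embeddings = rng.normal(size=(n_features, n_dimensions))


def embed(feature_vector, feature_embeddings):
    """Toy structured embedding: feature-weighted sum of the gene embeddings."""
    return np.asarray(feature_vector, dtype=float) @ feature_embeddings


def score_pair(item_vector, user_vector, feature_embeddings):
    """Toy association score: sigmoid of the dot product of both embeddings."""
    i = embed(item_vector, feature_embeddings)
    u = embed(user_vector, feature_embeddings)
    return 1.0 / (1.0 + np.exp(-(i @ u)))


drug = rng.normal(size=n_features)     # hypothetical drug feature vector
disease = rng.normal(size=n_features)  # hypothetical disease feature vector

probability = score_pair(drug, disease, feature_embeddings)
# One score per gene, summing each gene embedding over its dimensions.
gene_scores = feature_embeddings.sum(axis=1)
```

Because both embeddings are built from the same gene embeddings in this sketch, any change to one gene embedding affects the prediction and that gene's score at the same time, which is the kind of coupling the paragraph above describes.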

Install the latest release

Using pip

pip install jeli
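
After installing, one way to confirm that the package is visible to the current Python environment is to query its metadata with the standard library. This is a minimal sketch; the helper name `installed_version` is hypothetical and not part of JELI.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package_name):
    """Return the installed version string of a package, or None if absent."""
    try:
        return version(package_name)
    except PackageNotFoundError:
        return None


jeli_version = installed_version("jeli")
print("jeli is not installed" if jeli_version is None else f"jeli {jeli_version}")
```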

Docker

# Build the Docker image
docker build -t jeli .
# Run the image built in the previous step and open an interactive session
docker run -it --expose 3000 -p 3000:3000 jeli

Dependencies

OS: developed and tested on Debian Linux.

The complete list of dependencies for JELI is given in requirements.txt (for pip).

Usage

from jeli.JELI import JELI

from stanscofi.utils import load_dataset
from stanscofi.datasets import Dataset
from stanscofi.training_testing import random_simple_split
import pandas as pd

## loads the Gottlieb drug repurposing data set
data_args = load_dataset("Gottlieb", "./")
dataset = Dataset(**data_args)

## splits in training and testing sets without leakage
(train_folds, test_folds), _ = random_simple_split(dataset, 0.2, random_state=1234)
train = dataset.subset(train_folds)
test = dataset.subset(test_folds)

classifier = JELI({"cuda_on": False, "n_dimensions": 10, "random_state": 1234, "epochs": 25})

## trains JELI on the training set
classifier.fit(train)

## predicts on the testing set
scores = classifier.predict_proba(test)
classifier.print_scores(scores)
predictions = classifier.predict(scores, threshold=0.5)
classifier.print_classification(predictions)

## computes an embedding i (item/drug) with the fitted classifier
item = pd.DataFrame(dataset.items.toarray()[:,0],index=dataset.item_features,columns=["0"])
i = classifier.transform(item, is_item=True)

## computes an embedding u (user/disease) with the fitted classifier
user = pd.DataFrame(dataset.users.toarray()[:,0],index=dataset.user_features,columns=["0"])
u = classifier.transform(user, is_item=False)

## computes the feature-wise importance scores from embeddings
embs = classifier.model["feature_embeddings"]
feature_scores = embs.sum(axis=1)
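
Once the feature-wise scores are available, a natural next step is to rank the features. The sketch below is self-contained and uses made-up scores and gene names. It assumes the scores form a one-dimensional array with one value per feature, in the same order as the feature names (a tensor would need converting to a NumPy array first). The helper `top_features` is hypothetical, and ranking by absolute value is a choice made here, not something prescribed by JELI.

```python
import numpy as np
import pandas as pd


def top_features(feature_scores, feature_names, k=3):
    """Return the k features with the largest absolute importance scores."""
    scores = pd.Series(np.asarray(feature_scores, dtype=float), index=feature_names)
    order = scores.abs().sort_values(ascending=False).index
    return scores.loc[order].head(k)


# Hypothetical scores and gene names, for illustration only.
example_scores = [0.1, -2.0, 0.7, 1.5]
example_genes = ["GENE_A", "GENE_B", "GENE_C", "GENE_D"]
ranking = top_features(example_scores, example_genes, k=2)
print(ranking)
```

The signed scores are kept in the output so that the direction of each feature's contribution remains visible after ranking.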

Licence

This repository is under an OSI-approved MIT license.

Citation

If you use JELI in academic research, please cite it as follows:

Clémence Réda, Jill-Jênn Vie, Olaf Wolkenhauer. Joint Embedding-Classifier Learning for Interpretable Collaborative Filtering. 2024. hal-04625183.

Community guidelines with respect to contributions, issue reporting, and support

Pull requests and issue reports are welcome and can be made through the GitHub interface. For support, contact recess-project[at]proton.me. Please note that contributors and users must abide by the Code of Conduct.
