embedding-based item nearest neighborhoods extraction
DeepNeighbor
Embedding-based Retrieval for ANN Search and Recommendations!
DeepNeighbor is a high-level, flexible, and extensible package for embedding-based information retrieval from user-item interaction logs. As the name suggests, 'deep' refers to the deep learning models used to learn user/item embeddings, while 'neighbor' refers to approximate nearest neighbor search in the embedding space.
It has two main steps, Embed and Search:
model = Embed(data_path); model.train()
generates embeddings for users and items (Deep), and
model.search()
looks up the approximate nearest neighbors of a seed user/item (Neighbor).
Install
pip install deepneighbor
How To Use
from deepneighbor import Embed
model = Embed(data,model='gat')
model.train()
model.search(seed = 'Louis', k=10)
Input format
The input to Embed() should be the path to a *.csv or *.txt file (e.g. '\data\data.csv') with two columns, in this order: 'user' and 'item'. For each user, it is recommended that the items be ordered by time.
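As an illustration of the input format described above, the sketch below builds a two-column file body from a small interaction log. The sample rows, the timestamp field, the helper name to_user_item_csv, and the presence of a header row are all assumptions made for this example; the package itself only documents the two columns and their order.

```python
import csv
import io

# Hypothetical interaction log: (user, item, timestamp).
# The timestamp is used only for sorting and is not written out.
interactions = [
    ("Louis", "item_b", 3),
    ("Anna", "item_a", 2),
    ("Louis", "item_a", 1),
    ("Anna", "item_c", 5),
]


def to_user_item_csv(rows):
    """Return CSV text with 'user' and 'item' columns, items ordered by time per user."""
    # Sort by user first, then by timestamp within each user.
    ordered = sorted(rows, key=lambda r: (r[0], r[2]))
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["user", "item"])  # header row is an assumption
    for user, item, _ts in ordered:
        writer.writerow([user, item])
    return buf.getvalue()


csv_text = to_user_item_csv(interactions)
print(csv_text)
```

Writing csv_text to disk and passing that path to Embed() would then match the documented two-column layout.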
Models & parameters in Embed()
- Word2Vec: w2v
- Factorization Machines: fm
- Deep Semantic Similarity Model
- Siamese Network with triplet loss
- Deepwalk: deepwalk
- Graph convolutional network
- Neural Graph Collaborative Filtering algorithm: ngcf
- Matrix factorization: mf
- Graph attention network: gat
Model Parameters
deepwalk
model = Embed(data, model = 'deepwalk')
model.train(window_size=5,
            workers=1,
            iter=1,
            dimensions=128)
window_size: Skip-gram window size.
workers: Number of worker threads used to train the model (faster training on multicore machines).
iter: Number of iterations (epochs) over the corpus.
dimensions: Dimensions of the node embeddings.
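For intuition about the corpus that iter refers to, here is a minimal sketch of the general DeepWalk idea: random walks over the user-item graph produce node sequences that a skip-gram model can treat as sentences. This is not taken from the package's source; build_graph, random_walks, walk_length, and the node prefixes "u:" and "i:" are hypothetical names chosen for this example.

```python
import random
from collections import defaultdict


def build_graph(pairs):
    """Build an undirected bipartite graph from (user, item) pairs."""
    graph = defaultdict(list)
    for user, item in pairs:
        u, i = f"u:{user}", f"i:{item}"  # prefixes keep user and item ids distinct
        graph[u].append(i)
        graph[i].append(u)
    return graph


def random_walks(graph, num_walks, walk_length, seed=0):
    """Start num_walks uniform random walks of fixed length from every node."""
    rng = random.Random(seed)
    nodes = sorted(graph)
    walks = []
    for _ in range(num_walks):
        rng.shuffle(nodes)
        for start in nodes:
            walk = [start]
            while len(walk) < walk_length:
                # Every node was created from an edge, so it has a neighbor.
                walk.append(rng.choice(graph[walk[-1]]))
            walks.append(walk)
    return walks


pairs = [("Louis", "a"), ("Louis", "b"), ("Anna", "a"), ("Anna", "c")]
graph = build_graph(pairs)
walks = random_walks(graph, num_walks=2, walk_length=5)
print(walks[0])
```

Each walk alternates between user and item nodes, so users who share items tend to appear in the same sequences and end up close together in the embedding space.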
graph attention network
model = Embed(data, model = 'gat')
model.train(window_size=5,
            learning_rate=0.01,
            epochs=10,
            dimensions=128,
            num_of_walks=80,
            beta=0.5,
            gamma=0.5)
window_size: Skip-gram window size.
learning_rate: Learning rate for optimizing the graph attention network.
epochs: Number of gradient descent iterations.
dimensions: Dimensions of the embedding for each node (user/item).
num_of_walks: Number of random walks.
beta and gamma: Regularization parameters.
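Both parameter lists include window_size, described as the skip-gram window size. The sketch below shows how a skip-gram window is commonly interpreted: each node in a sequence is paired with the nodes at most window_size positions away. It illustrates the general technique only; skipgram_pairs and the sample sequence are hypothetical, and the package may construct its training pairs differently.

```python
def skipgram_pairs(sequence, window_size):
    """Return (center, context) pairs for every position in the sequence."""
    pairs = []
    for i, center in enumerate(sequence):
        lo = max(0, i - window_size)
        hi = min(len(sequence), i + window_size + 1)
        for j in range(lo, hi):
            if j != i:  # a node is not its own context
                pairs.append((center, sequence[j]))
    return pairs


walk = ["u:Louis", "i:a", "u:Anna", "i:c"]
pairs = skipgram_pairs(walk, window_size=1)
print(pairs)
```

A larger window pairs each node with more distant nodes in the sequence, which trades local precision for broader context.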
How To Search
model.search(seed, k)
seed: The user or item whose neighbors are searched for.
k: Number of nearest neighbors to return.
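To make the meaning of seed and k concrete, the following sketch performs an exact, brute-force top-k search over a handful of toy embeddings. It is a stand-in for what an approximate nearest neighbor index speeds up, not the package's implementation; the embedding values, the top_k helper, and the choice of cosine similarity are assumptions for this example.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(embeddings, seed, k):
    """Return the k entries most similar to the seed, excluding the seed itself."""
    query = embeddings[seed]
    scored = [
        (name, cosine(query, vec))
        for name, vec in embeddings.items()
        if name != seed
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy 2-dimensional embeddings (hypothetical values).
embeddings = {
    "Louis": [1.0, 0.0],
    "Anna": [0.9, 0.1],
    "Bob": [0.0, 1.0],
    "Cara": [-1.0, 0.0],
}
neighbors = top_k(embeddings, seed="Louis", k=2)
print(neighbors)
```

The brute-force scan compares the seed against every other vector, which is why approximate indexes are preferred once the number of users and items grows large.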
Examples
Open Colab to run the example with Facebook data.
Contact
Please contact louiswang524@gmail.com to collaborate or to provide feedback.
License
This project is released under the MIT License.
Hashes for deepneighbor-0.3.1-py3-none-any.whl
Algorithm | Hash digest
---|---
SHA256 | 6a3b9ffa7143e80849c79c982d22d81bfa48ab6e0c911ac191d813fb1c3b3bae
MD5 | 71e2a0f52b78ba439a83f1026b2d809e
BLAKE2b-256 | 6a05a2d64451010d2b1e1085b9f3403ffc419b5e0579d68e17d54bdb05e75c7f