A comprehensive recommendation library with match, ranking, and multi-task learning models

NextRec

A Unified, Efficient, and Scalable Recommendation System Framework

Introduction

NextRec is a modern recommendation system framework built on PyTorch, providing a unified modeling, training, and evaluation experience for researchers and engineering teams. The framework adopts a modular design with rich built-in model implementations, data-processing tools, and production-ready training components, enabling quick coverage of multiple recommendation scenarios.

This project draws on several open-source recommendation libraries; its general-purpose layers are based on the mature implementations in torch-rechub. These parts of the code are still at an early stage and are gradually being replaced with our own implementations. If you find a bug, please report it in the issue section. Contributions are welcome.

Key Features

  • Multi-scenario Recommendation: Supports ranking (CTR/CVR), retrieval, multi-task learning, and generative recommendation models such as TIGER and HSTU — with more models continuously added.
  • Unified Feature Engineering & Data Pipeline: Provides Dense/Sparse/Sequence feature definitions, persistent DataProcessor, and optimized RecDataLoader, forming a complete “Define → Process → Load” workflow.
  • Efficient Training & Evaluation: A standardized training engine with optimizers, LR schedulers, early stopping, checkpoints, and logging — ready out-of-the-box.
  • Developer-friendly Engineering Experience: Modular and extensible design, full tutorial support, GPU/MPS acceleration, and visualization tools.
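To make the "early stopping" item above concrete, here is a minimal, framework-independent sketch of the idea. The `EarlyStopping` class, its parameters, and the validation numbers are hypothetical illustrations and are not NextRec's actual API; the built-in training engine may work differently.

```python
class EarlyStopping:
    """Stop training when a monitored metric stops improving (hypothetical helper)."""

    def __init__(self, patience: int = 3, min_delta: float = 0.0) -> None:
        self.patience = patience      # epochs to wait after the last improvement
        self.min_delta = min_delta    # smallest change that counts as an improvement
        self.best = float("-inf")     # best metric seen so far (higher is better)
        self.bad_epochs = 0           # consecutive epochs without improvement

    def step(self, metric: float) -> bool:
        """Record one epoch's validation metric; return True when training should stop."""
        if metric > self.best + self.min_delta:
            self.best = metric
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience


# Simulated validation AUC per epoch (made-up numbers, for illustration only).
val_auc = [0.70, 0.74, 0.75, 0.75, 0.74, 0.75, 0.73]

stopper = EarlyStopping(patience=3)
stopped_at = None
for epoch, auc in enumerate(val_auc, start=1):
    if stopper.step(auc):
        stopped_at = epoch
        break

print(f"stopped at epoch {stopped_at}, best AUC {stopper.best:.2f}")
```

With a patience of 3, training halts once three consecutive epochs fail to beat the best score, which avoids spending further epochs on a model that has plateaued.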

Installation

NextRec can be installed either with UV or with pip from a source checkout.

Option 1: Using UV (Recommended)

UV is a modern, high-performance Python package manager offering fast dependency resolution and installation.

git clone https://github.com/zerolovesea/NextRec.git
cd NextRec

# Install UV if not already installed
pip install uv

# Create virtual environment and install dependencies
uv sync

# Activate the virtual environment
source .venv/bin/activate  # macOS/Linux
# or
.venv\Scripts\activate     # Windows

# Install the package in editable mode
uv pip install -e .

Note: Make sure to deactivate any other conda/virtual environments before running uv sync to avoid environment conflicts.

Option 2: Using pip/source installation

git clone https://github.com/zerolovesea/NextRec.git
cd NextRec

# Install dependencies
pip install -r requirements.txt
pip install -r test_requirements.txt

# Install the package in editable mode
pip install -e .
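After either installation route, a quick check that the package is importable can save debugging time later. The sketch below uses only the Python standard library; the helper name `describe_install` is a made-up illustration, and it assumes the import name and the distribution name are both `nextrec`.

```python
from importlib import metadata, util


def describe_install(package: str) -> str:
    """Report whether a top-level package can be imported and, if known, its version."""
    # find_spec returns None when the top-level package cannot be located.
    if util.find_spec(package) is None:
        return f"{package}: not importable"
    try:
        version = metadata.version(package)
    except metadata.PackageNotFoundError:
        # Importable, but no installed distribution metadata under this name.
        version = "unknown version"
    return f"{package}: importable ({version})"


print(describe_install("nextrec"))
```

If the output says the package is not importable, the most likely cause is that a different virtual environment is active than the one the package was installed into.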

5-Minute Quick Start

The following example demonstrates a full DeepFM training & inference pipeline using the MovieLens dataset:

import pandas as pd

from nextrec.models.ranking.deepfm import DeepFM
from nextrec.basic.features import DenseFeature, SparseFeature

# Load the MovieLens 100k sample. Categorical columns are assumed to be
# integer-encoded already, with IDs starting at 0.
df = pd.read_csv("dataset/movielens_100k.csv")

target = 'label'

# One dense (numeric) feature and four sparse (categorical) features.
# vocab_size is the largest ID plus one, so every ID has an embedding row.
dense_features = [DenseFeature('age')]
sparse_features = [
    SparseFeature('user_id', vocab_size=df['user_id'].max()+1, embedding_dim=4),
    SparseFeature('item_id', vocab_size=df['item_id'].max()+1, embedding_dim=4),
    SparseFeature('gender', vocab_size=df['gender'].max()+1, embedding_dim=4),
    SparseFeature('occupation', vocab_size=df['occupation'].max()+1, embedding_dim=4),
]

# Build DeepFM with a two-layer MLP and L1/L2 regularization on the
# embedding and dense weights.
model = DeepFM(
    dense_features=dense_features,
    sparse_features=sparse_features,
    mlp_params={"dims": [256, 128], "activation": "relu", "dropout": 0.5},
    target=target,
    device='cpu',
    model_id="deepfm_with_processor",
    embedding_l1_reg=1e-6,
    dense_l1_reg=1e-5,
    embedding_l2_reg=1e-5,
    dense_l2_reg=1e-4,
)

# Configure the optimizer and loss, train, then run inference.
# For brevity, this example predicts on the training data itself.
model.compile(optimizer="adam", optimizer_params={"lr": 1e-3, "weight_decay": 1e-5}, loss="bce")
model.fit(train_data=df, metrics=['auc', 'recall', 'precision'], epochs=10, batch_size=512, shuffle=True, verbose=1)
preds = model.predict(df)
print(f'preds: {preds}')
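The quick start sets each `vocab_size` to the column maximum plus one, which only works when the column holds non-negative integer IDs. If a column contains raw strings instead, it needs to be encoded first. The sketch below shows one plain-Python way to do that; `encode_ids` is a hypothetical helper rather than part of NextRec, and the library's own DataProcessor is likely the more convenient route in practice.

```python
def encode_ids(values):
    """Map raw categorical values to contiguous integer IDs starting at 0."""
    mapping = {}   # raw value -> integer ID, assigned in order of first appearance
    encoded = []
    for value in values:
        if value not in mapping:
            mapping[value] = len(mapping)
        encoded.append(mapping[value])
    return encoded, mapping


# Toy column of raw categorical values (illustrative data).
genders = ["F", "M", "M", "F", "M"]
gender_ids, gender_map = encode_ids(genders)

# With contiguous IDs starting at 0, the vocabulary size is max ID + 1.
vocab_size = max(gender_ids) + 1
print(gender_ids, gender_map, vocab_size)
```

Because the IDs are contiguous, `max + 1` equals the number of distinct values, so the embedding table has no unused rows.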

More Tutorials

The tutorials/ directory provides examples for ranking, retrieval, multi-task learning, and data processing:

  • movielen_match_dssm.py — DSSM retrieval on MovieLens 100k
  • movielen_ranking_deepfm.py — DeepFM ranking on MovieLens 100k
  • example_ranking_din.py — DIN (Deep Interest Network) example
  • example_match_dssm.py — DSSM retrieval example
  • example_multitask.py — Multi-task learning example

Data Processing Example

NextRec offers a unified interface for preprocessing sparse and sequence features:

import pandas as pd
from nextrec.data.preprocessor import DataProcessor

df = pd.read_csv("dataset/movielens_100k.csv")

# Register 'movie_title' as a sparse feature, hash-encoded with hash_size=1000.
processor = DataProcessor()
processor.add_sparse_feature('movie_title', encode_method='hash', hash_size=1000)

# Fit the processor on the data, then apply the transformation.
processor.fit(df)
df = processor.transform(df, return_dict=False)

print("\nSample training data:")
print(df.head())
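For readers unfamiliar with hash encoding, the sketch below illustrates the general idea behind `encode_method='hash'` with `hash_size=1000`: each string is mapped to a bucket index in a fixed range, so no vocabulary has to be stored. The `hash_bucket` function and the choice of SHA-256 are assumptions made for illustration; the hash function NextRec actually uses is not stated here.

```python
import hashlib


def hash_bucket(value: str, hash_size: int) -> int:
    """Map a string to a stable bucket index in [0, hash_size)."""
    # A cryptographic digest is used because Python's built-in hash() for
    # strings is randomized per process and would not be reproducible.
    digest = hashlib.sha256(value.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % hash_size


titles = ["Toy Story (1995)", "GoldenEye (1995)", "Toy Story (1995)"]
buckets = [hash_bucket(title, hash_size=1000) for title in titles]
print(buckets)
```

Identical strings always land in the same bucket, while distinct strings may occasionally collide; a larger `hash_size` lowers the collision rate at the cost of a larger embedding table.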

Supported Models

Ranking Models

| Model | Paper | Venue / Year | Status |
| --- | --- | --- | --- |
| FM | Factorization Machines | ICDM 2010 | Supported |
| AFM | Attentional Factorization Machines: Learning the Weight of Feature Interactions via Attention Networks | IJCAI 2017 | Supported |
| DeepFM | DeepFM: A Factorization-Machine based Neural Network for CTR Prediction | IJCAI 2017 | Supported |
| Wide&Deep | Wide & Deep Learning for Recommender Systems | DLRS 2016 | Supported |
| xDeepFM | xDeepFM: Combining Explicit and Implicit Feature Interactions | KDD 2018 | Supported |
| FiBiNET | FiBiNET: Combining Feature Importance and Bilinear Feature Interaction for CTR Prediction | RecSys 2019 | Supported |
| PNN | Product-based Neural Networks for User Response Prediction | ICDM 2016 | Supported |
| AutoInt | AutoInt: Automatic Feature Interaction Learning | CIKM 2019 | Supported |
| DCN | Deep & Cross Network for Ad Click Predictions | ADKDD 2017 | Supported |
| DIN | Deep Interest Network for CTR Prediction | KDD 2018 | Supported |
| DIEN | Deep Interest Evolution Network | AAAI 2019 | Supported |
| MaskNet | MaskNet: Feature-wise Gating Blocks for High-dimensional Sparse Recommendation Data | 2020 | Supported |
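Several of the ranking models listed above (FM, DeepFM, xDeepFM) build on the factorization machine's second-order interaction term. As a rough illustration of that shared building block, the sketch below computes the term both directly over all feature pairs and with the well-known linear-time reformulation. The function names and toy numbers are illustrative only and do not reflect NextRec's implementation.

```python
from itertools import combinations


def fm_pairwise_naive(x, v):
    """Sum over all pairs i < j of <v_i, v_j> * x_i * x_j (quadratic in feature count)."""
    total = 0.0
    for i, j in combinations(range(len(x)), 2):
        dot = sum(a * b for a, b in zip(v[i], v[j]))
        total += dot * x[i] * x[j]
    return total


def fm_pairwise_fast(x, v):
    """Same quantity via 0.5 * sum_f [(sum_i v_if x_i)^2 - sum_i (v_if x_i)^2]."""
    n, k = len(x), len(v[0])
    total = 0.0
    for f in range(k):
        weighted = [v[i][f] * x[i] for i in range(n)]
        total += sum(weighted) ** 2 - sum(w * w for w in weighted)
    return 0.5 * total


# Toy input: three features, each with a 2-dimensional latent vector.
x = [1.0, 0.0, 2.0]
v = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]

print(fm_pairwise_naive(x, v), fm_pairwise_fast(x, v))
```

Both functions return the same value up to floating-point rounding; the second form is the one that makes FM practical for wide, sparse inputs.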

Retrieval Models

| Model | Paper | Venue / Year | Status |
| --- | --- | --- | --- |
| DSSM | Learning Deep Structured Semantic Models | CIKM 2013 | Supported |
| DSSM v2 | DSSM with pairwise BPR-style optimization | - | Supported |
| YouTube DNN | Deep Neural Networks for YouTube Recommendations | RecSys 2016 | Supported |
| MIND | Multi-Interest Network with Dynamic Routing | CIKM 2019 | Supported |
| SDM | Sequential Deep Matching Model | - | Supported |
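The "pairwise BPR-style optimization" mentioned for DSSM v2 most likely refers to the Bayesian Personalized Ranking objective, which pushes the score of an interacted item above that of a sampled negative. The sketch below shows that loss for a single (positive, negative) pair in plain Python. It is an assumption about what "BPR-style" means here, and `bpr_loss` is a hypothetical name, not a NextRec function.

```python
import math


def bpr_loss(pos_score: float, neg_score: float) -> float:
    """Pairwise BPR loss: -log(sigmoid(pos_score - neg_score)), computed stably."""
    diff = pos_score - neg_score
    # -log(sigmoid(d)) equals log(1 + exp(-d)); branch to avoid overflow in exp.
    if diff > 0:
        return math.log1p(math.exp(-diff))
    return -diff + math.log1p(math.exp(diff))


# The loss is small when the positive item outscores the negative one,
# and large when the ordering is wrong.
print(bpr_loss(2.0, 0.0), bpr_loss(0.0, 0.0), bpr_loss(0.0, 2.0))
```

Unlike a pointwise loss such as binary cross-entropy, this objective depends only on the score difference, so it optimizes the relative ordering of items rather than calibrated probabilities.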

Multi-task Models

| Model | Paper | Venue / Year | Status |
| --- | --- | --- | --- |
| MMOE | Modeling Task Relationships in Multi-task Learning | KDD 2018 | Supported |
| PLE | Progressive Layered Extraction | RecSys 2020 | Supported |
| ESMM | Entire Space Multi-task Model | SIGIR 2018 | Supported |
| ShareBottom | Multitask Learning | - | Supported |

Generative Models

| Model | Paper | Venue / Year | Status |
| --- | --- | --- | --- |
| TIGER | Recommender Systems with Generative Retrieval | NeurIPS 2023 | In Progress |
| HSTU | Hierarchical Sequential Transduction Units | - | In Progress |

Contributing

We welcome contributions of any form!

How to Contribute

  1. Fork the repository
  2. Create your feature branch (git checkout -b feature/AmazingFeature)
  3. Commit your changes (git commit -m 'Add AmazingFeature')
  4. Push your branch (git push origin feature/AmazingFeature)
  5. Open a Pull Request

Before submitting a PR, please run tests using pytest test/ -v or python -m pytest to ensure everything passes.

Code Style

  • Follow PEP 8
  • Provide unit tests for new functionality
  • Update documentation accordingly
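As a rough guide to what a unit test for new functionality might look like, here is a small pytest-style sketch. The function under test, `recall_at_k`, is a hypothetical stand-in defined inline so the example is self-contained; real tests would import the code being contributed and live under the test/ directory.

```python
def recall_at_k(recommended, relevant, k):
    """Fraction of relevant items that appear in the top-k recommendations (hypothetical)."""
    if not relevant:
        return 0.0
    top_k = recommended[:k]
    hits = sum(1 for item in top_k if item in relevant)
    return hits / len(relevant)


def test_recall_at_k_counts_hits_in_top_k():
    # Only item 2 is in the top 2, and there are 3 relevant items in total.
    assert recall_at_k([1, 2, 3, 4], {2, 4, 9}, k=2) == 1 / 3


def test_recall_at_k_handles_empty_relevant_set():
    # An empty relevant set must not trigger a division by zero.
    assert recall_at_k([1, 2, 3], set(), k=2) == 0.0
```

pytest collects functions whose names start with `test_`, so a file like this placed in the test directory would be picked up by the commands mentioned above.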

Reporting Issues

When submitting issues on GitHub, please include:

  • Description of the problem
  • Reproduction steps
  • Expected behavior
  • Actual behavior
  • Environment info (Python version, PyTorch version, etc.)
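The environment details above can be gathered with a short script and pasted into the issue. The sketch below relies only on the standard library; the helper name `environment_report` and the default package list are illustrative choices, not a project-provided tool.

```python
import platform
from importlib import metadata


def environment_report(packages=("torch", "nextrec", "pandas")):
    """Return a plain-text summary of the Python, OS, and package versions."""
    lines = [
        f"Python: {platform.python_version()}",
        f"Platform: {platform.platform()}",
    ]
    for name in packages:
        try:
            version = metadata.version(name)
        except metadata.PackageNotFoundError:
            # The package has no installed distribution in this environment.
            version = "not installed"
        lines.append(f"{name}: {version}")
    return "\n".join(lines)


print(environment_report())
```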

License

This project is licensed under the Apache 2.0 License.


Acknowledgements

NextRec is inspired by the following great open-source projects:

  • torch-rechub - A Lighting Pytorch Framework for Recommendation Models, Easy-to-use and Easy-to-extend.
  • FuxiCTR — Configurable and reproducible CTR prediction library
  • RecBole — Unified and efficient recommendation library
  • PaddleRec — Large-scale recommendation algorithm library

Special thanks to all open-source contributors!

