
Evaluate generative event sequence models on the long-horizon prediction task.


HoTPP: A Long-Horizon Event Sequence Prediction Benchmark



The HoTPP benchmark focuses on the long-horizon prediction of event sequences. Each event is characterized by its timestamp, label, and possible additional structured data. Sequences of this type are also known as Marked Temporal Point Processes (Marked TPP, MTPP).
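To make the task concrete, here is a minimal sketch of a marked event sequence and of a long-horizon split: the model observes a history prefix and is asked about every event that falls within a time horizon after it. The `Event` class, the `split_history_and_horizon` helper, and the `horizon` parameter are illustrative assumptions, not part of the HoTPP API.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Event:
    """A single marked event (hypothetical container, not a HoTPP class)."""
    timestamp: float  # time in dataset-specific units
    label: int        # integer event type (the "mark")


def split_history_and_horizon(
    events: List[Event], history_size: int, horizon: float
) -> Tuple[List[Event], List[Event]]:
    """Split a time-ordered sequence into a history prefix and prediction targets.

    Targets are the events after the prefix whose timestamps lie within
    `horizon` time units of the last history event.
    """
    if not 0 < history_size <= len(events):
        raise ValueError("history_size must be in [1, len(events)]")
    history = events[:history_size]
    deadline = history[-1].timestamp + horizon
    targets = [e for e in events[history_size:] if e.timestamp <= deadline]
    return history, targets


sequence = [Event(0.0, 3), Event(1.5, 1), Event(2.0, 3), Event(4.5, 2), Event(9.0, 1)]
history, targets = split_history_and_horizon(sequence, history_size=2, horizon=4.0)
print([e.timestamp for e in targets])  # [2.0, 4.5]
```

With a history ending at time 1.5 and a horizon of 4.0, the events at 2.0 and 4.5 are targets, while the event at 9.0 lies outside the horizon.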

🎉 The HoTPP paper has been accepted to Neurocomputing (Q1) 2026.

🎉 DEF (aka DeTPP) has been presented at AAAI-26 (main track, oral).

Features

  • Next event prediction
  • Long-horizon prediction
  • Downstream classification
  • Working with heterogeneous input and output fields (general event sequences)
  • Distributed training
  • RNN and Transformer models (including HuggingFace causal models)
  • Discrete and continuous-time models
  • Improved TPP thinning algorithm
  • Optimized training and inference (ODE, continuous-time models, multi-point generation with RNN)

Implemented Methods

The list of implemented papers:

Year | Name | Paper | Source
2026 | DEF (aka DeTPP) | Detecting the Future: All-at-Once Event Sequence Forecasting with Horizon Matching | AAAI 2026
2026 | HoTPP | HoTPP Benchmark: Are We Good at the Long Horizon Events Forecasting? | Neurocomputing
2025 | Diffusion | Non-autoregressive diffusion-based temporal point processes for continuous-time long-term event prediction | Expert Systems with Applications
2022 | AttNHP | Transformer embeddings of irregularly spaced events and their participants | ICLR 2022
2022 | HYPRO | Hypro: A hybridly normalized probabilistic model for long-horizon prediction of event sequences | NeurIPS 2022
2020 | IFTPP | Intensity-free learning of temporal point processes | ICLR 2020
2019 | ODE | Latent ordinary differential equations for irregularly-sampled time series | NeurIPS 2019
2017 | NHP | The neural hawkes process: A neurally self-modulating multivariate point process | NeurIPS 2017
2016 | RMTPP | Recurrent marked temporal point processes: Embedding event history to vector | SIGKDD 2016

Other methods:

  • Simple baselines (MostPopular, Last-K)
  • Next-K extensions of IFTPP and RMTPP
  • Transformer variants of RMTPP and IFTPP

Installation

Install via PyPI:

pip install hotpp-benchmark

To install downstream evaluation tools:

pip install 'hotpp-benchmark[downstream]'

Dependency installation sometimes requires the C and C++ compilers to be specified through environment variables:

CXX=<c++-compiler> CC=<gcc-compiler> pip install hotpp-benchmark

Training and evaluation

The code is divided into the core library and dataset-specific scripts and configuration files.

The dataset-specific part is located in the experiments folder. Each subfolder includes data preparation scripts, model configuration files, and a README file. Data files and logs are usually stored in the same directory. All scripts must be executed from the directory of the specific dataset. Refer to the individual README files for more details.

To train the model, use the following command:

python3 -m hotpp.train --config-dir configs --config-name <model>

To evaluate a specific checkpoint, use the following command:

python3 -m hotpp.evaluate --config-dir configs --config-name <model>

To run multiseed training and evaluation:

python3 -m hotpp.train_multiseed --config-dir configs --config-name <model>

To run multi-GPU training on 2 GPUs:

mpirun -np 2 python3 -m hotpp.train --config-dir configs --config-name <model> ++trainer.devices=2 ++trainer.strategy=ddp

Evaluation results

All evaluation results can be found in the experiments folder.

See tables.

Library architecture


HoTPP leverages high-level decomposition from PyTorch Lightning.

DataModule. All datasets are converted to a set of Parquet files. Each record in a Parquet file contains three main fields: id, timestamps, and labels. The id field represents the identity associated with a sequence (user, client, etc.). Timestamps are stored as an array of floating point numbers with a dataset-specific unit of measure. Labels is an array of integers representing a sequence of event types. The dataloader generates a PaddedBatch object containing a dictionary of padded sequences.
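The record layout and the padding step can be sketched in plain Python. This is only an illustration of the idea: the `pad_field` and `collate` helpers, the padding values, and the `lengths` key are assumptions, and the sketch uses lists rather than the library's actual PaddedBatch class.

```python
from typing import Dict, List, Sequence

# Toy records with the three main fields described above.
records = [
    {"id": 0, "timestamps": [0.0, 1.5, 2.0], "labels": [3, 1, 3]},
    {"id": 1, "timestamps": [0.5], "labels": [2]},
]


def pad_field(sequences: Sequence[Sequence], pad_value) -> List[list]:
    """Right-pad variable-length sequences to the length of the longest one."""
    max_len = max(len(s) for s in sequences)
    return [list(s) + [pad_value] * (max_len - len(s)) for s in sequences]


def collate(batch: List[dict]) -> Dict[str, list]:
    """Build a dictionary of padded fields plus the true sequence lengths."""
    return {
        "timestamps": pad_field([r["timestamps"] for r in batch], 0.0),
        "labels": pad_field([r["labels"] for r in batch], 0),
        "lengths": [len(r["timestamps"]) for r in batch],
    }


padded = collate(records)
print(padded["labels"])   # [[3, 1, 3], [2, 0, 0]]
print(padded["lengths"])  # [3, 1]
```

Keeping the true lengths next to the padded fields lets downstream code mask out padding positions when computing losses and metrics.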

Module. The Module implements high-level logic specific to each group of methods. For example, there is a module for autoregressive models and another for next-k approaches. The Module incorporates a loss function, metric evaluator, and sequence encoder. The sequence encoder can produce discrete outputs, as in traditional RNNs, or continuous-time outputs, as in the NHP method.

Trainer. The Trainer object should typically not be modified, except through a configuration file. The Trainer uses the Module and DataModule to train the model and evaluate metrics.

Configuration files

HoTPP uses Hydra for configuration. The easiest way to create a new configuration file is to start from one in the experiments folder. The configuration file includes sections for the logger, data module, module, and trainer. There are also some required top-level fields like model_path and report. It is highly recommended to specify a random seed by setting seed_everything.
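As a rough illustration of the layout described above, the sketch below checks a configuration, represented as a plain dictionary, for the expected sections and required top-level fields. The section key names (`logger`, `data_module`, `module`, `trainer`), the example values, and the `find_missing_keys` helper are assumptions made for this example; consult the configuration files in the experiments folder for the actual keys.

```python
from typing import Any, Dict, List

# Key names for the sections are assumptions based on the description above.
EXPECTED_SECTIONS = ("logger", "data_module", "module", "trainer")
# Required top-level fields named in the text.
REQUIRED_FIELDS = ("model_path", "report")


def find_missing_keys(config: Dict[str, Any]) -> List[str]:
    """Return expected sections and required fields absent from the config."""
    return [key for key in EXPECTED_SECTIONS + REQUIRED_FIELDS if key not in config]


# Hypothetical configuration with placeholder values.
config = {
    "seed_everything": 42,
    "model_path": "checkpoints/example.ckpt",
    "logger": {},
    "data_module": {},
    "module": {},
    "trainer": {},
}

missing = find_missing_keys(config)
has_seed = "seed_everything" in config
print(missing)   # ['report']
print(has_seed)  # True
```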

Hyperparameter tuning

Hyperparameters can be tuned with WandB Sweeps. Example sweep configuration files, such as experiments/amazon/configs/sweep_next_item.yaml, can be used as follows:

wandb sweep ./configs/<sweep-configuration-file>

The above command prints the command for running the agent, e.g.:

wandb agent <sweep-id>

There is a special script in the library to analyze tuning results:

python3 -m hotpp.parse_wandb_hopt ./configs/<sweep-configuration-file> <sweep-id>

Reproducibility

To achieve reproducible results, it is highly recommended to use the provided Dockerfile. However, there may be minor differences depending on the specific GPU model.

The reference evaluation results are stored in the results subfolder within each dataset directory in the experiments folder.

Tests

To run tests, use the following command:

pytest tests

Known issues

If downstream evaluation hangs during LightGBM or CatBoost training, try setting the following environment variable:

OMP_NUM_THREADS=1 python3 -m hotpp.evaluate --config-dir configs --config-name <model>
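When evaluation is launched from a Python script rather than from the shell, the same variable can be passed through the environment of the child process. The sketch below only builds the command and environment and does not run anything; the `build_evaluate_command` helper and the `my_model` configuration name are placeholders introduced for illustration.

```python
import os
import sys
from typing import Dict, List, Tuple


def build_evaluate_command(
    config_name: str, num_threads: int = 1
) -> Tuple[List[str], Dict[str, str]]:
    """Return the evaluation command and an environment limiting OpenMP threads."""
    env = dict(os.environ)  # copy, so the current process is left untouched
    env["OMP_NUM_THREADS"] = str(num_threads)
    command = [
        sys.executable, "-m", "hotpp.evaluate",
        "--config-dir", "configs",
        "--config-name", config_name,
    ]
    return command, env


command, env = build_evaluate_command("my_model")
print(" ".join(command[1:]))
# To actually launch the evaluation from the dataset directory:
#   import subprocess
#   subprocess.run(command, env=env, check=True)
```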

Citation

If you use HoTPP in your project, please cite the following paper:

@article{karpukhin2024hotppbenchmark,
  title={HoTPP Benchmark: Are We Good at the Long Horizon Events Forecasting?},
  author={Karpukhin, Ivan and Shipilov, Foma and Savchenko, Andrey},
  journal={arXiv preprint arXiv:2406.14341},
  year={2024},
  url ={https://arxiv.org/abs/2406.14341}
}

If you incorporate ideas from DeTPP, use it for comparison, or reference it in a review, please cite the following paper:

@article{karpukhin2024detpp,
  title={DeTPP: Leveraging Object Detection for Robust Long-Horizon Event Prediction},
  author={Karpukhin, Ivan and Savchenko, Andrey},
  journal={arXiv preprint arXiv:2408.13131},
  year={2024},
  url ={https://arxiv.org/abs/2408.13131}
}
