Fast temporal negative edge sampler for graph neural networks
Project description
Temporal Negative Edge Sampler
The Problem:
Poursafaei et al. in NeurIPS 2022 described that random negative edges make temporal link prediction task easy. This is because random negatives are often between unrelated node pairs that have never interacted, making them trivially easy for models to distinguish from true positive edges. So they introduced the concept of historical negatives—edges that existed in the past but are inactive at the current time—making the prediction task more realistic and challenging.
Huang et al. in NeurIPS 2023 created a benchmark based on this called TGB (Temporal Graph Benchmark) where in the validation and test sets they included both historical and random negatives.
- Historical negatives: Edges that existed in the past but are inactive at the current time between a pair of nodes.
- Random negatives: Edges that never existed in the past between a pair of nodes.
TGB generated 10-20 negative edges per positive edge where 50% is historical and 50% is random.
However, they only released pre-generated pickle files with negatives for their benchmark datasets, and did not publish code to generate such negatives for new datasets or for training-time augmentation.
Negative edges are generated using the following steps:
-
Group all edges by timestamp, sorted in chronological order.
-
Iterate through each timestamp group (i.e., a batch of edges sharing the same timestamp):
- Add the current batch to a streaming graph structure, which keeps track of all previously seen edges (the historical graph).
-
For each source node in the current batch:
- Sample
n × phistorical negative edges:- These are destination nodes that have been connected to the source in the past, but are not connected to the source in the current timestamp.
- Sample
n × (1 - p)random negative edges:- These are destination nodes that have never been connected to the source in the historical graph (including the current batch).
- Sample
While the algorithm described above is correct and benchmark-aligned, it is computationally expensive if implemented naively in Python. A naive Python implementation will not scale to large datasets and may run indefinitely for even moderate-sized graphs.
Temporal Negative Edge Sampler is a high-performance library for generating negative edges in temporal link prediction tasks, supporting both historical and random negative sampling. It mirrors the evaluation strategy used in the Temporal Graph Benchmark (TGB), but extends it to training and custom datasets. The core implementation is written in parallelized C++ with Python bindings, making it scalable to large graphs with millions of edges.
API
collect_all_negatives_by_timestamp
Collects negative edges for temporal graph sampling, processing edges grouped by timestamp.
Signature
from temporal_negative_edge_sampler import collect_all_negatives_by_timestamp
collect_all_negatives_by_timestamp(
sources: np.ndarray,
targets: np.ndarray,
timestamps: np.ndarray,
is_directed: bool,
num_negatives_per_positive: int,
historical_negative_percentage: float = 0.5
) -> Tuple[np.ndarray, np.ndarray]
Parameters
sources(np.ndarray): Array of source node IDstargets(np.ndarray): Array of target node IDstimestamps(np.ndarray): Array of timestamps for each edgeis_directed(bool): Whether the graph is directednum_negatives_per_positive(int): Number of negative edges to sample per positive edgehistorical_negative_percentage(float, optional): Ratio of historical vs random negatives (default: 0.5)
References:
Poursafaei et al. (2022)
Towards Better Evaluation for Dynamic Link Prediction
Farimah Poursafaei, Shenyang Huang, Kris Pelrine, Reihaneh Rabbany
In NeurIPS 2022 Datasets and Benchmarks Track.
https://arxiv.org/pdf/2207.10128
Huang et al. (2023)
Temporal Graph Benchmark for Machine Learning on Temporal Graphs
Shenyang Huang, Farimah Poursafaei, Jacob Danovitch, Matthias Fey, Weihua Hu, Emanuele Rossi, Jure Leskovec, Michael Bronstein, Guillaume Rabusseau, Reihaneh Rabbany
In NeurIPS 2023 Datasets and Benchmarks Track.
https://arxiv.org/pdf/2307.01026
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distributions
Built Distributions
Filter files by name, interpreter, ABI, and platform.
If you're not sure about the file name format, learn more about wheel file names.
Copy a direct link to the current filters
File details
Details for the file temporal_negative_edge_sampler-0.1.14-cp311-cp311-manylinux_2_34_x86_64.whl.
File metadata
- Download URL: temporal_negative_edge_sampler-0.1.14-cp311-cp311-manylinux_2_34_x86_64.whl
- Upload date:
- Size: 214.4 kB
- Tags: CPython 3.11, manylinux: glibc 2.34+ x86-64
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.10.15
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
8caa77c471e9bf63a817e192f671183e6cfeb3d72cfb111fccafc9f8557b0a5f
|
|
| MD5 |
662bc0149592c5f5bc7108a118a2106a
|
|
| BLAKE2b-256 |
7a49a52f00b2010e27ed1b78da188ceb9f07023cb8bb8ae9bd2b114dfc06feb2
|
File details
Details for the file temporal_negative_edge_sampler-0.1.14-cp310-cp310-manylinux_2_34_x86_64.whl.
File metadata
- Download URL: temporal_negative_edge_sampler-0.1.14-cp310-cp310-manylinux_2_34_x86_64.whl
- Upload date:
- Size: 212.9 kB
- Tags: CPython 3.10, manylinux: glibc 2.34+ x86-64
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.10.15
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
9a32ede770d55ebaed9ea426e6dd9e597cc7ad230e19687f64bce819e7e6cd80
|
|
| MD5 |
3b0a462520a7d87ee7ddb6e42bf250bc
|
|
| BLAKE2b-256 |
85b05f06f3d878fce65c36f675d7f182f55853c316af9957c7dc95e0c5f9049a
|
File details
Details for the file temporal_negative_edge_sampler-0.1.14-cp310-cp310-macosx_15_0_arm64.whl.
File metadata
- Download URL: temporal_negative_edge_sampler-0.1.14-cp310-cp310-macosx_15_0_arm64.whl
- Upload date:
- Size: 73.6 kB
- Tags: CPython 3.10, macOS 15.0+ ARM64
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.10.14
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
76daa58a0664dfb7eb30d40d0dcf49d402135bd851e2273c8046f3aef8d8c637
|
|
| MD5 |
499a29373291a53be6c1a93b3aa17437
|
|
| BLAKE2b-256 |
0d861f479d507e96da4f8cd8b4f259ec99b9013c3ad019b627070e08f1ae1189
|
File details
Details for the file temporal_negative_edge_sampler-0.1.14-cp39-cp39-manylinux_2_34_x86_64.whl.
File metadata
- Download URL: temporal_negative_edge_sampler-0.1.14-cp39-cp39-manylinux_2_34_x86_64.whl
- Upload date:
- Size: 211.9 kB
- Tags: CPython 3.9, manylinux: glibc 2.34+ x86-64
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.10.15
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
c36bb671419be1376cc8d8085a2d79692cac136dabe1c4f6a008be87e7290f7d
|
|
| MD5 |
5d523ecee2d1d50d8d70f55f798f3fe8
|
|
| BLAKE2b-256 |
526e4c7e85021e863d24f278ee6571a709112a2ca0fbc843b8d9f2dd13842a33
|
File details
Details for the file temporal_negative_edge_sampler-0.1.14-cp38-cp38-manylinux_2_34_x86_64.whl.
File metadata
- Download URL: temporal_negative_edge_sampler-0.1.14-cp38-cp38-manylinux_2_34_x86_64.whl
- Upload date:
- Size: 211.8 kB
- Tags: CPython 3.8, manylinux: glibc 2.34+ x86-64
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.10.15
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
1a76c23486d2308dee461ce60af09bbe538cdff7c810de23dbdd518f88567a02
|
|
| MD5 |
a59d2c20107dc212af203a96ec0f7131
|
|
| BLAKE2b-256 |
68a7878ba11a76a7eb905c78c1c640aa205db6b860c759d0f19b299968dbdd24
|