A unified codebase integrating image manipulation detection & localization, deepfake detection, document manipulation detection, and AIGC detection.
[NeurIPS 2025] ForensicHub: A Unified Benchmark & Codebase for All-Domain Fake Image Detection and Localization
Bo Du†, Xuekang Zhu†, Xiaochen Ma†, Chenfan Qu†, Kaiwen Feng†, Zhe Yang
Chi-Man Pun, Jian Liu*, Jizhe Zhou*
†: joint first author & equal contribution *: corresponding author
🙋‍♂️ Welcome to ForensicHub!
ForensicHub is the go-to benchmark and modular codebase for all-domain fake image detection and localization, covering deepfake detection (Deepfake), image manipulation detection and localization (IMDL), artificial intelligence-generated image detection (AIGC), and document image manipulation localization (Doc). Whether you're benchmarking forensic models or building your own cross-domain pipelines, ForensicHub offers a flexible, configuration-driven architecture to streamline development, comparison, and analysis.
🏆 FIDL Leaderboard 🏆
We maintain the FIDL leaderboard, a unified ranking of model generalization across all domains. See here for more details.
| 🏆 Rank | Model | Deepfake 🖼️ | IMDL 📝 | AIGC 🤖 | Doc 📄 | Avg ⭐ |
|---|---|---|---|---|---|---|
| 🥇 1 | Effort | 0.614 | 0.587 | 0.410 | 0.788 | 0.600 |
| 🥈 2 | Segformer-b3 | 0.629 | 0.576 | 0.339 | 0.724 | 0.567 |
| 🥉 3 | Clip-ViT-L/14 | 0.664 | 0.543 | 0.317 | 0.724 | 0.562 |
| 4 | ConvNeXT | 0.662 | 0.573 | 0.337 | 0.669 | 0.560 |
| 5 | Mesorch | 0.541 | 0.562 | 0.460 | 0.591 | 0.538 |
| 6 | UnivFD | 0.442 | 0.486 | 0.463 | 0.734 | 0.531 |
| 7 | IML-ViT | 0.581 | 0.562 | 0.325 | 0.626 | 0.523 |
| ... |
🚤Update
- [2025.7.17] Released some missing pretrained weights for DocTamper detection models; see this issue for details.
- [2025.7.11] Updated MODEL and POSTFUNC to lazy loading. Packages are now checked only when a model is actually used, which reduces unnecessary package installation.
- [2025.7.10] Added a script for single-image inference; see Code.
- [2025.7.6] Added a new AIGC model, FatFormer; see Code.
- [2025.7.1] Added documentation for data preparation & JSON generation and for running training & evaluation in ForensicHub; see Data Preparation and Running Evaluation.
- [2025.6.22] Added a summary of the models and evaluators in ForensicHub; see Document.
- [2025.6.16] Added detailed installation and YAML configuration instructions; see Document.
- [2025.6.14] Added four new backbones: UNet, ViT, MobileNet, and DenseNet. More backbones are on the way!
👨‍💻 About
☑️About the Developers:
- ForensicHub's project leaders/supervisors are Associate Professor 🏀Jizhe Zhou (周吉喆), Sichuan University🇨🇳, and Jian Liu (刘健), leader of the Computer Vision Algorithm Research Group at Ant Group Co., Ltd.
- ForensicHub's codebase designer and coding leader is Bo Du (杜博), Sichuan University🇨🇳.
- ForensicHub is jointly sponsored and advised by Prof. Jiancheng LV (吕建成), Sichuan University 🐼, and Prof. Chi-Man PUN (潘治文), University of Macau 🇲🇴, through the Research Center of Machine Learning and Industry Intelligence, China MOE platform.
📦 Resources
You can find model resources under IFF-Protocol, including checkpoints (also mirrored on OneDrive), training parameters, and hardware specifications.
Checkpoints for Document Benchmark: https://pan.baidu.com/s/13ViyJebu12I0GN3BucBQrg?pwd=npkx or https://drive.google.com/drive/folders/1RZZxwYIX5e-lHKDw1CD45FwFC0QqJ7im?usp=sharing
Checkpoints for AIGC Benchmark: https://pan.baidu.com/s/11Jr2wjp6lAz9IBNWnbHlVg?pwd=kzhf or https://drive.google.com/drive/folders/1M-qe5xOblVZgKiBQ9j1Q-GQ4ao5VJMHZ?usp=sharing
Pretrained backbone weights for Document models: https://pan.baidu.com/s/1lsArVWzcJiADUcYYeqyClw?pwd=4gf4 or https://drive.google.com/drive/folders/1NiHeRAcG2VkoN-JFgV5O_4YynQFiQWUw?usp=sharing. Place the checkpoint under the corresponding model’s folder.
🕵️‍♂️ Architecture
ForensicHub provides four core modular components:
🗂️ Datasets
Datasets handle the data loading process and are required to return fields that conform to the ForensicHub specification.
🔧 Transforms
Transforms handle the data pre-processing and augmentation for different tasks.
🧠 Models
Models, through alignment with Datasets and unified output, allow for the inclusion of various state-of-the-art image forensic models.
📊 Evaluators
Evaluators cover commonly used image- and pixel-level metrics for different tasks, and are implemented with GPU acceleration to improve evaluation efficiency during training and testing.
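As a concrete illustration of an image-level metric, here is image-level F1 at a fixed threshold (the same quantity the `ImageF1` evaluator with `threshold: 0.5` computes in the sample YAML later in this README). This is a plain-NumPy sketch for intuition only; ForensicHub's shipped evaluators are GPU-accelerated and their code may differ.

```python
import numpy as np

def image_f1(scores, labels, threshold=0.5):
    """Binarize predicted scores at `threshold`, then compute F1 against labels."""
    preds = (np.asarray(scores, dtype=float) >= threshold).astype(int)
    labels = np.asarray(labels).astype(int)
    tp = int(np.sum((preds == 1) & (labels == 1)))  # true positives
    fp = int(np.sum((preds == 1) & (labels == 0)))  # false positives
    fn = int(np.sum((preds == 0) & (labels == 1)))  # false negatives
    if tp == 0:
        return 0.0  # precision/recall both undefined or zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)
```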
📁 Project Structure Overview
ForensicHub/
├── common/ # Common modules
│ ├── backbones/ # Backbones and feature extractors
│   ├── evaluation/           # Image- and pixel-level evaluators
│ ├── utils/ # Utilities
│ └── wrapper/ # Wrappers for dataset, model, etc.
├── core/ # Core module providing abstract base classes
├── statics/ # YAML configuration files for training and testing
├── tasks/ # Components for different sub-tasks
│ ├── aigc/
│ ├── deepfake/
│ ├── document/
│ └── imdl/
└── training_scripts/         # Scripts for training and evaluation
📀Installation
We recommend cloning the project locally.
📉Clone
Simply run the following command:
git clone https://github.com/scu-zjz/ForensicHub.git
Also, since ForensicHub is compatible with DeepfakeBench (which hasn't been uploaded to PyPI), you'll need to clone our forked version (Site) locally and install it with `pip install -e .`
🎯Quick Start
The Quick Start example is based on the local clone setup. ForensicHub is a modular and configuration-driven lightweight framework. You only need to use the built-in or custom Dataset, Transform, and Model components, register them, and then launch the pipeline using a YAML configuration file.
Training a ResNet on the DiffusionForensics dataset for AIGC detection
- Dataset Preparation
Download the DiffusionForensics dataset from https://github.com/ZhendongWang6/DIRE.
The experiment uses only the ImageNet portion. Format the data as JSON. ForensicHub does not restrict how the data is loaded; just make sure the Dataset returns the fields defined in /core/base_dataset.py, so users are free to implement their own loading logic. In this case, we use /tasks/aigc/datasets/label_dataset.py, which expects a JSON file with entries like the following, where label 0 marks a real image and label 1 a generated one:
[
{
"path": "/mnt/data3/public_datasets/AIGC/DiffusionForensics/images/train/imagenet/real/n03982430/ILSVRC2012_val_00039791.JPEG",
"label": 0
},
{
"path": "/mnt/data3/public_datasets/AIGC/DiffusionForensics/images/train/imagenet/real/n03982430/ILSVRC2012_val_00022594.JPEG",
"label": 0
},
...
]
- Component Preparation
In this example, the Model is ResNet50, which is already registered in /common/backbones/resnet.py, so no extra code is needed. The Transform is also pre-registered in /tasks/aigc/transforms/aigc_transforms.py, providing basic augmentations and ImageNet-standard normalization.
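The register-then-configure flow can be pictured with a toy registry. All names below are illustrative; ForensicHub's real decorators and registry live in the codebase and may differ in detail.

```python
# Toy model registry sketching the register-then-configure pattern
# (illustrative names; not ForensicHub's actual API).
MODELS = {}

def register_model(cls):
    """Decorator that makes a class constructible by name from a config."""
    MODELS[cls.__name__] = cls
    return cls

@register_model
class Resnet50:
    def __init__(self, pretrained=True, num_classes=1):
        self.pretrained = pretrained
        self.num_classes = num_classes

def build_model(name, init_config):
    # Mirrors how a YAML `model:` block could be resolved into an instance.
    return MODELS[name](**init_config)
```

With this pattern, the YAML's `name: Resnet50` plus its `init_config` mapping is all the framework needs to instantiate the component.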
- YAML Config & Training
ForensicHub supports lightweight configuration via YAML files. In this example, aside from data preparation, no additional code is required. Here is a sample training YAML, /statics/aigc/resnet_train.yaml. The four components (Model, Dataset, Transform, and Evaluator) are all initialized via init_config:
# DDP
gpus: "4,5"
flag: train
# Log
log_dir: "./log/aigc_resnet_df_train"
# Task
if_predict_label: true
if_predict_mask: false
# Model
model:
name: Resnet50
# Model specific setting
init_config:
pretrained: true
num_classes: 1
# Train dataset
train_dataset:
name: AIGCLabelDataset
dataset_name: DiffusionForensics_train
init_config:
image_size: 224
path: /mnt/data1/public_datasets/AIGC/DiffusionForensics/images/train.json
# Test dataset (one or many)
test_dataset:
- name: AIGCLabelDataset
dataset_name: DiffusionForensics_val
init_config:
image_size: 224
path: /mnt/data1/public_datasets/AIGC/DiffusionForensics/images/val.json
# Transform
transform:
name: AIGCTransform
# Evaluators
evaluator:
- name: ImageF1
init_config:
threshold: 0.5
# Training related
batch_size: 768
test_batch_size: 128
epochs: 20
accum_iter: 1
record_epoch: 0 # Save the best only after record epoch.
# Test related
no_model_eval: false
test_period: 1
# Logging & TensorBoard
log_per_epoch_count: 20
# DDP & AMP settings
find_unused_parameters: false
use_amp: true
# Optimizer parameters
weight_decay: 0.05
lr: 1e-4
blr: 0.001
min_lr: 1e-5
warmup_epochs: 1
# Device and training control
device: "cuda"
seed: 42
resume: ""
start_epoch: 0
num_workers: 8
pin_mem: true
# Distributed training parameters
world_size: 1
local_rank: -1
dist_on_itp: false
dist_url: "env://"
After creating the YAML file, you can launch training with statics/run.sh after updating the file paths. You can also use statics/batch_run.sh for batch experiments, which internally invokes multiple run.sh runs. Testing works similarly and only requires configuring the same four components.
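Before launching, a quick sanity check that a parsed config contains the four component sections can save a failed DDP start. This helper is illustrative and not part of ForensicHub; the required keys mirror the sample YAML above.

```python
# Illustrative pre-launch check (not part of ForensicHub); required keys
# mirror the component sections of the sample YAML above.
REQUIRED_SECTIONS = ("model", "train_dataset", "test_dataset", "transform", "evaluator")

def check_config(cfg):
    """Raise ValueError if any component section is missing from the config dict."""
    missing = [k for k in REQUIRED_SECTIONS if k not in cfg]
    if missing:
        raise ValueError(f"config is missing sections: {missing}")
    return True
```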
- LLM Config (Optional)
- Qwen3-VL (transformers>=4.57.0, qwen_vl_utils>=0.0.14)
Citation
@misc{du2025forensichubunifiedbenchmark,
title={ForensicHub: A Unified Benchmark & Codebase for All-Domain Fake Image Detection and Localization},
author={Bo Du and Xuekang Zhu and Xiaochen Ma and Chenfan Qu and Kaiwen Feng and Zhe Yang and Chi-Man Pun and Jian Liu and Jizhe Zhou},
year={2025},
eprint={2505.11003},
archivePrefix={arXiv},
primaryClass={cs.CV},
url={https://arxiv.org/abs/2505.11003},
}