AV-Deepfake1M

This is the official repository for the paper AV-Deepfake1M: A Large-Scale LLM-Driven Audio-Visual Deepfake Dataset.

Abstract

The detection and localization of highly realistic deepfake audio-visual content are challenging even for the most advanced state-of-the-art methods. While most of the research efforts in this domain are focused on detecting high-quality deepfake images and videos, only a few works address the problem of the localization of small segments of audio-visual manipulations embedded in real videos. In this research, we emulate the process of such content generation and propose the AV-Deepfake1M dataset. The dataset contains content-driven (i) video manipulations, (ii) audio manipulations, and (iii) audio-visual manipulations for more than 2K subjects resulting in a total of more than 1M videos. The paper provides a thorough description of the proposed data generation pipeline accompanied by a rigorous analysis of the quality of the generated data. The comprehensive benchmark of the proposed dataset utilizing state-of-the-art deepfake detection and localization methods indicates a significant drop in performance compared to previous datasets. The proposed dataset will play a vital role in building the next-generation deepfake localization methods.

Dataset

Download

We're hosting the 1M-Deepfakes Detection Challenge at ACM MM 2024.

Baseline Benchmark

Method                      AP@0.5  AP@0.75  AP@0.9  AP@0.95  AR@50  AR@20  AR@10  AR@5
PyAnnote                    00.03   00.00    00.00   00.00    00.67  00.67  00.67  00.67
Meso4                       09.86   06.05    02.22   00.59    38.92  38.81  36.47  26.91
MesoInception4              08.50   05.16    01.89   00.50    39.27  39.00  35.78  24.59
EfficientViT                14.71   02.42    00.13   00.01    27.04  26.43  23.90  20.31
TriDet + VideoMAEv2         21.67   05.83    00.54   00.06    20.27  20.12  19.50  18.18
TriDet + InternVideo        29.66   09.02    00.79   00.09    24.08  23.96  23.50  22.55
ActionFormer + VideoMAEv2   20.24   05.73    00.57   00.07    19.97  19.81  19.11  17.80
ActionFormer + InternVideo  36.08   12.01    01.23   00.16    27.11  27.00  26.60  25.80
BA-TFD                      37.37   06.34    00.19   00.02    45.55  35.95  30.66  26.82
BA-TFD+                     44.42   13.64    00.48   00.03    48.86  40.37  34.67  29.88
UMMAFormer                  51.64   28.07    07.65   01.58    44.07  43.45  42.09  40.27

Metadata Structure

Each subset (train, val) has a JSON metadata file containing a list of dictionaries. Each dictionary has the following fields.

  • file: the path to the video file.
  • original: if the current video is fake, the path to the original video; otherwise, the original path in VoxCeleb2.
  • split: the name of the current subset.
  • modify_type: the type of modification across modalities, one of ["real", "visual_modified", "audio_modified", "both_modified"]. We evaluate the deepfake detection performance based on this field.
  • audio_model: the audio generation model used for generating this video.
  • fake_segments: the timestamps of the fake segments. We evaluate the temporal localization performance based on this field.
  • audio_fake_segments: the timestamps of the fake segments in audio modality.
  • visual_fake_segments: the timestamps of the fake segments in visual modality.
  • video_frames: the number of frames in the video.
  • audio_frames: the number of frames in the audio.
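
As a minimal sketch of working with this structure, the snippet below reads a metadata list and summarizes it by modify_type. The sample records are hypothetical and show only a subset of the fields; the values are invented for illustration, and the layout of fake_segments as a list of [start, end] pairs is an assumption rather than something stated above.

```python
import json
from collections import Counter

# Hypothetical records: paths and numbers are invented for illustration.
# ASSUMPTION: each entry of "fake_segments" is a [start, end] pair.
sample_metadata = [
    {"file": "example/real.mp4", "modify_type": "real",
     "fake_segments": [], "video_frames": 100},
    {"file": "example/fake_a.mp4", "modify_type": "audio_modified",
     "fake_segments": [[1.0, 1.5]], "video_frames": 120},
    {"file": "example/fake_b.mp4", "modify_type": "both_modified",
     "fake_segments": [[1.0, 1.5], [3.0, 3.25]], "video_frames": 150},
]


def load_metadata(path):
    """Read one subset's metadata file (a JSON list of dictionaries)."""
    with open(path, "r", encoding="utf-8") as f:
        return json.load(f)


def count_by_modify_type(records):
    """Count how many videos fall under each modify_type value."""
    return Counter(record["modify_type"] for record in records)


def total_fake_duration(record):
    """Sum the lengths of all fake segments in one record."""
    return sum(end - start for start, end in record["fake_segments"])


# Round-trip through JSON to mimic what load_metadata would return.
records = json.loads(json.dumps(sample_metadata))
counts = count_by_modify_type(records)
durations = [total_fake_duration(record) for record in records]
```

With a real dataset, records would come from load_metadata("/path/to/dataset/train_metadata.json") instead of the in-memory sample.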

SDK

We provide a Python library, avdeepfake1m, to load the dataset and evaluate predictions.

Installation

pip install avdeepfake1m

Usage

Prepare the dataset as follows.

|- train_metadata.json
|- train_metadata
|  |- ...
|- train
|  |- ...
|- val_metadata.json
|- val_metadata
|  |- ...
|- val
|  |- ...
|- test_files.txt
|- test
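
Before loading, it may help to confirm that the dataset root matches the layout above. The following is a small sketch using only the standard library; the helper name missing_entries is hypothetical and is not part of the avdeepfake1m package.

```python
from pathlib import Path

# Top-level entries taken from the directory layout shown above.
EXPECTED_ENTRIES = [
    "train_metadata.json",
    "train_metadata",
    "train",
    "val_metadata.json",
    "val_metadata",
    "val",
    "test_files.txt",
    "test",
]


def missing_entries(root):
    """Return the expected top-level entries that do not exist under root."""
    root = Path(root)
    return [name for name in EXPECTED_ENTRIES if not (root / name).exists()]


# An empty list means every expected file and directory was found.
print(missing_entries("/path/to/dataset"))
```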

Load the dataset.

from avdeepfake1m.loader import AVDeepfake1mDataModule

# access to Lightning DataModule
dm = AVDeepfake1mDataModule("/path/to/dataset")

Evaluate the predictions. First, prepare the predictions as described in the details. Then run the following code.

from avdeepfake1m.evaluation import ap_ar_1d, auc
print(ap_ar_1d("<PREDICTION_JSON>", "<METADATA_JSON>", "file", "fake_segments", 1, [0.5, 0.75, 0.9, 0.95], [50, 30, 20, 10, 5], [0.5, 0.55, 0.6, 0.65, 0.7, 0.75, 0.8, 0.85, 0.9, 0.95]))
print(auc("<PREDICTION_TXT>", "<METADATA_JSON>"))
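
The list [0.5, 0.75, 0.9, 0.95] in the call above lines up with the AP@ columns of the benchmark table. Assuming these are temporal IoU thresholds, as is conventional in temporal localization, the sketch below shows how the overlap between one predicted segment and one ground-truth segment could be scored. It is an illustration of the idea only, not the library's implementation, and the function names are hypothetical.

```python
def temporal_iou(pred, truth):
    """Intersection over union of two (start, end) intervals on the time axis."""
    pred_start, pred_end = pred
    true_start, true_end = truth
    # Overlap length; clamped to zero when the intervals do not intersect.
    intersection = max(0.0, min(pred_end, true_end) - max(pred_start, true_start))
    union = (pred_end - pred_start) + (true_end - true_start) - intersection
    return intersection / union if union > 0 else 0.0


def is_hit(pred, truth, threshold):
    """A prediction counts as correct when its IoU reaches the threshold."""
    return temporal_iou(pred, truth) >= threshold


# A prediction covering 0-4 s against a true fake segment at 0-3 s.
iou = temporal_iou((0.0, 4.0), (0.0, 3.0))
hits = {t: is_hit((0.0, 4.0), (0.0, 3.0), t) for t in [0.5, 0.75, 0.9, 0.95]}
```

Under this reading, stricter thresholds demand tighter boundaries, which is consistent with the scores in the benchmark table dropping sharply from AP@0.5 to AP@0.95.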

License

The dataset is under the EULA. You need to agree and sign the EULA to access the dataset.

The other parts of this project are under the CC BY-NC 4.0 license. See LICENSE for details.

References

If you find this work useful in your research, please cite it.

@article{cai2023avdeepfake1m,
  title = {AV-Deepfake1M: A Large-Scale LLM-Driven Audio-Visual Deepfake Dataset},
  author = {Cai, Zhixi and Ghosh, Shreya and Adatia, Aman Pankaj and Hayat, Munawar and Dhall, Abhinav and Stefanov, Kalin},
  journal = {arXiv preprint arXiv:2311.15308},
  year = {2023},
}
