Articles


Deepfakes and the New AI-Generated Fake Media Creation-Detection Arms Race

Falsified videos created by AI—in particular, by deep neural networks (DNNs)—are a recent twist to the disconcerting problem of online disinformation. Although fabrication and manipulation of digital images and videos are not new, the rapid development of AI technology in recent years has made the process of creating convincing fake videos much easier and faster. AI-generated fake videos first caught the public's attention in late 2017, when a Reddit account with the name Deepfakes posted pornographic videos generated with a DNN-based face-swapping algorithm. Subsequently, the term deepfake has been used more broadly to refer to all types of AI-generated impersonating videos.

While there are interesting and creative applications of deepfakes, they are also likely to be weaponized. We were among the early responders to this phenomenon, and in early 2018 we developed the first deepfake detection method, based on the lack of realistic eye-blinking in the early generations of deepfake videos. Subsequently, there has been a surge of interest in developing deepfake detection methods.
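The intuition behind blink-based detection can be sketched in a few lines. This is an illustrative toy, not the authors' actual method: it assumes a per-frame eye-aspect-ratio (EAR) signal has already been extracted by a facial landmark detector, and the threshold and rate values are hypothetical.

```python
# Toy sketch of blink-based deepfake screening (illustrative only).
# Assumes ear_values is a list of per-frame eye-aspect-ratio (EAR) numbers
# from a facial landmark detector; low EAR means the eyes are closed.

def count_blinks(ear_values, threshold=0.2, min_frames=2):
    """Count blinks as runs of >= min_frames consecutive frames below threshold."""
    blinks, run = 0, 0
    for ear in ear_values:
        if ear < threshold:
            run += 1
        else:
            if run >= min_frames:
                blinks += 1
            run = 0
    if run >= min_frames:  # handle a blink that ends at the last frame
        blinks += 1
    return blinks

def looks_suspicious(ear_values, fps=30, min_blinks_per_min=4):
    """People blink roughly 15-20 times per minute on camera; a clip with
    almost no blinks is flagged for closer inspection."""
    minutes = len(ear_values) / fps / 60
    return count_blinks(ear_values) < min_blinks_per_min * max(minutes, 1 / 60)
```

In practice the published method used a DNN over eye-region crops rather than a fixed EAR threshold, but the underlying cue (an unnaturally low blink rate) is the same.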

A climax of these efforts is this year's Deepfake Detection Challenge. Overall, the winning solutions are a tour de force of advanced DNNs (an average precision of 82.56 percent by the top performer). These provide us effective tools to expose deepfakes that are automated and mass-produced by AI algorithms. However, we need to be cautious in reading these results. Although the organizers have made their best effort to simulate situations where deepfake videos are deployed in real life, there is still a significant discrepancy between the performance on the evaluation data set and a more realistic data set: when tested on unseen videos, the top performer's accuracy dropped to 65.18 percent.
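As a reminder of what the reported metric measures, average precision summarizes the precision-recall trade-off across all score thresholds. A minimal sketch (ignoring tied scores for simplicity; not the challenge's official scoring code):

```python
# Minimal average-precision computation (illustrative, ties ignored).
# labels: 1 for a real deepfake, 0 otherwise; scores: detector confidences.

def average_precision(labels, scores):
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    tp, fp, ap = 0, 0, 0.0
    total_pos = sum(labels)
    for i in order:
        if labels[i]:
            tp += 1
            ap += tp / (tp + fp)  # precision at each newly recalled positive
        else:
            fp += 1
    return ap / total_pos
```

A detector that ranks every fake above every real clip scores 1.0; interleaving errors pull the value down, which is what the drop from 82.56 to 65.18 percent on unseen videos reflects.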

In addition, all solutions are based on clever designs of DNNs and data augmentations, but provide little insight beyond "black box"-type classification algorithms. Furthermore, these detection results do not reflect the actual detection performance of the algorithms on a single deepfake video, especially one that has been manually processed and perfected after being generated by the AI algorithms. Such "crafted" deepfake videos are more likely to cause real damage, and careful manual post-processing can reduce or remove the artifacts that the detection algorithms are predicated on.