Two-pronged technique identifies manipulated expressions and identity swaps

When it comes to deepfake videos, computer scientists can now detect manipulated facial expressions with greater accuracy than ever before. The achievement heralds a new era in the development of automated tools designed to detect manipulated videos.

Significance of the work

In experiments on two challenging data sets, the new recognition technology accurately pinpointed 99% of manipulated expressions in video clips.

Over the past few years, developments in the deepfake world have made it relatively easy to swap one talking head for another, or to swap one person’s genuine facial expressions for fake ones. But until now, few methods existed for detecting the latter. For this reason, this new development by University of California, Riverside researchers is considered notable.

Identity swaps vs. facial expressions

Prior to this point, researchers had created tools capable of detecting deepfake identity swaps with relative accuracy. For example, tools could generally determine the authenticity of a video that featured your organization’s chief executive (yes, that is the real chief executive talking or no, that is not the real chief executive talking). But tools had a tougher time discerning whether a genuine video of your organization’s chief executive (or whomever) had been manipulated to show inaccurate facial expressions.

While the detection of inaccurate facial expressions may seem trivial, consider the power of facial expressions in person-to-person communications. Facial expressions communicate emotions, intentions, and even action requests. Smiles vs. scowls can suggest entirely different preferred sets of business deliverables or desired outcomes.

UC Riverside method

How does the UC Riverside method of deepfake detection work? It divides the task into two components within a deep neural network. The first component observes facial expressions and sends information about the regions that contain the expression, such as the eyes, nose or mouth, to a second component of the system, known as the encoder-decoder. The encoder-decoder architecture is responsible for manipulation detection and localization.

This framework, known as Expression Manipulation Detection (EMD), can both detect and localize the specific areas within an image that have been manipulated. In other words, it can create ‘heat maps’ of the specific areas of the face that were subjected to manipulation.
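To make the idea of localization concrete, here is a minimal Python sketch of how a heat map might be read once a detector has produced it. It is not the researchers' network: the per-pixel scores, the region boxes, the function names and the 0.5 threshold are all hypothetical stand-ins chosen for illustration.

```python
from typing import Dict, List, Tuple

Box = Tuple[int, int, int, int]  # (top, left, bottom, right); bottom/right exclusive
HeatMap = List[List[float]]      # per-pixel manipulation score in [0, 1] (assumed)


def region_scores(heat_map: HeatMap, regions: Dict[str, Box]) -> Dict[str, float]:
    """Average the heat-map scores inside each named face region."""
    scores = {}
    for name, (top, left, bottom, right) in regions.items():
        values = [heat_map[r][c] for r in range(top, bottom) for c in range(left, right)]
        if not values:
            raise ValueError(f"region {name!r} is empty")
        scores[name] = sum(values) / len(values)
    return scores


def flag_regions(scores: Dict[str, float], threshold: float = 0.5) -> List[str]:
    """Return the regions whose mean score exceeds the (assumed) threshold."""
    return sorted(name for name, score in scores.items() if score > threshold)


# Toy 4x4 heat map: high scores in the bottom two rows (the "mouth" area).
toy_heat_map = [
    [0.1, 0.1, 0.1, 0.1],
    [0.1, 0.2, 0.2, 0.1],
    [0.8, 0.9, 0.9, 0.8],
    [0.7, 0.9, 0.9, 0.7],
]
# Hypothetical region boxes; a real system would derive these from the face itself.
toy_regions = {"eyes": (0, 0, 2, 4), "mouth": (2, 0, 4, 4)}

scores = region_scores(toy_heat_map, toy_regions)
flagged = flag_regions(scores)
print(flagged)  # prints ['mouth']
```

In this toy case only the mouth region is flagged, which mirrors the kind of output a heat map is meant to convey: not just that a frame was altered, but where.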

Further details

Experimental analyses show that the Expression Manipulation Detection method performs better on average than other tools, both in detecting facial expression manipulations and in detecting identity swaps. According to UC Riverside researchers, EMD accurately detected 99% of manipulated videos, indicating a significant breakthrough in the detection of manipulated content.
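A figure such as "99% of manipulated videos" is most naturally read as a detection rate: the share of truly manipulated clips that a tool flags. The article does not spell out the exact metric, so the short Python sketch below is only an illustration of how a detection rate differs from overall accuracy. The labels and predictions are made-up toy values, not results from the study.

```python
from typing import Sequence


def detection_rate(labels: Sequence[bool], predictions: Sequence[bool]) -> float:
    """Fraction of truly manipulated clips that were flagged as manipulated."""
    flagged = [pred for label, pred in zip(labels, predictions) if label]
    if not flagged:
        raise ValueError("no manipulated clips in the sample")
    return sum(flagged) / len(flagged)


def accuracy(labels: Sequence[bool], predictions: Sequence[bool]) -> float:
    """Fraction of all clips, genuine or manipulated, that were classified correctly."""
    if not labels:
        raise ValueError("empty sample")
    return sum(label == pred for label, pred in zip(labels, predictions)) / len(labels)


# Toy data (hypothetical): True means "manipulated".
labels      = [True, True, True, True, False, False]
predictions = [True, True, True, False, False, True]

print(detection_rate(labels, predictions))  # prints 0.75
print(accuracy(labels, predictions))
```

Here three of the four manipulated clips are caught, while overall accuracy is lower because one genuine clip is also flagged by mistake. The two numbers can diverge, which is why it helps to know which one a headline figure refers to.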

Real-world applications

The detection of genuine or falsified emotional expressions is useful in a variety of disciplines, including image processing, cyber security, robotics, psychological studies and virtual reality development.

Learn more about deepfake detection

For more about the researchers’ work, see the paper entitled “Detection and Localization of Facial Expression Manipulations,” which was presented at the 2022 Winter Conference on Applications of Computer Vision. Or learn more about deepfakes in an interview with the CEO of Cyabra, Dan Brahmy.

Deepfake Frequently Asked Questions (FAQs):

What are deepfakes and why are they dangerous?

Deepfakes use digital technologies to artificially manipulate representations of real people. This can create false narratives and spread misinformation, which can lead to fraud, defamation, violence or other negative outcomes. Deepfakes are considered exceptionally dangerous because they can take cyber criminality to the next level.

Can deepfake technology be used for good purposes?

There are positive use cases for deepfake technology; not all of them involve nefarious intent. For example, synthetic media can improve communication capabilities among the differently enabled, assist educators in delivering innovative lessons, and support compelling storytelling within the arts sector.

What is Expression Manipulation Detection (EMD)?

Expression manipulation detection discerns facial expressions while providing information about the areas of the visage that contain the expression. Regions include the mouth, eyes, nose, forehead, chin and more.

Can facial recognition tools detect deepfakes?

Facial recognition tools are not particularly sophisticated when it comes to the detection of deepfakes. Commercial facial recognition APIs can be easily deceived by deepfake content. Nonetheless, advances in technology suggest that facial recognition tools will be able to identify deepfakes more effectively in the future.

Who invented deepfakes?

The invention of deepfake technology is attributed to a group of academic researchers in the 1990s and, later, to amateurs in online communities. Industry has only pursued deepfake development in recent years.