Deepfakes as a threat to a speaker and facial recognition: An overview of tools and attack vectors

A Firc, K Malinka, P Hanáček - Heliyon, 2023 - cell.com
Deepfakes present an emerging threat in cyberspace. Recent developments in machine
learning make deepfakes highly believable, and very difficult to differentiate between what is …

Emo: Emote portrait alive generating expressive portrait videos with audio2video diffusion model under weak conditions

L Tian, Q Wang, B Zhang, L Bo - European Conference on Computer …, 2024 - Springer
In this work, we tackle the challenge of enhancing the realism and expressiveness in talking
head video generation by focusing on the dynamic and nuanced relationship between audio …

A review on deepfake generation and detection: bibliometric analysis

A Kaushal, S Kumar, R Kumar - Multimedia Tools and Applications, 2024 - Springer
Deepfake refers to an artificial intelligence-based technique to produce manipulated videos
that look realistic. However, this good aspect of Deepfake sometimes poses serious threats to …

Marlin: Masked autoencoder for facial video representation learning

Z Cai, S Ghosh, K Stefanov, A Dhall… - Proceedings of the …, 2023 - openaccess.thecvf.com
This paper proposes a self-supervised approach to learn universal facial representations
from videos, that can transfer across a variety of facial analysis tasks such as Facial Attribute …

Embodied understanding of driving scenarios

Y Zhou, L Huang, Q Bu, J Zeng, T Li, H Qiu… - … on Computer Vision, 2024 - Springer
Embodied scene understanding serves as the cornerstone for autonomous agents to
perceive, interpret, and respond to open driving scenarios. Such understanding is typically …

Non-contrastive unsupervised learning of physiological signals from video

J Speth, N Vance, P Flynn… - Proceedings of the IEEE …, 2023 - openaccess.thecvf.com
Subtle periodic signals such as blood volume pulse and respiration can be extracted from
RGB video, enabling noncontact health monitoring at low cost. Advancements in remote …

Celebv-text: A large-scale facial text-video dataset

J Yu, H Zhu, L Jiang, CC Loy… - Proceedings of the …, 2023 - openaccess.thecvf.com
Text-driven generation models are flourishing in video generation and editing. However,
face-centric text-to-video generation remains a challenge due to the lack of a suitable …

Facechain-imagineid: Freely crafting high-fidelity diverse talking faces from disentangled audio

C Xu, Y Liu, J Xing, W Wang, M Sun… - Proceedings of the …, 2024 - openaccess.thecvf.com
In this paper, we abstract the process of people hearing speech, extracting meaningful cues,
and creating various dynamically audio-consistent talking faces, termed Listening and …

Parameter-efficient orthogonal finetuning via butterfly factorization

W Liu, Z Qiu, Y Feng, Y Xiu, Y Xue, L Yu… - arXiv preprint arXiv …, 2023 - arxiv.org
Large foundation models are becoming ubiquitous, but training them from scratch is
prohibitively expensive. Thus, efficiently adapting these powerful models to downstream …

Mostgan-v: Video generation with temporal motion styles

X Shen, X Li, M Elhoseiny - … of the IEEE/CVF Conference on …, 2023 - openaccess.thecvf.com
Video generation remains a challenging task due to spatiotemporal complexity and the
requirement of synthesizing diverse motions with temporal consistency. Previous works …