Distributed intelligence for IoT-based smart cities: a survey

IA Hashem, A Siddiqa, FA Alaba, M Bilal… - Neural Computing and …, 2024 - Springer
The remarkable miniaturization of Internet of Things (IoT)-based systems and the rise of
distributed intelligence are promising research paradigms in the design of smart cities. IoT …

Binding touch to everything: Learning unified multimodal tactile representations

F Yang, C Feng, Z Chen, H Park… - Proceedings of the …, 2024 - openaccess.thecvf.com
The ability to associate touch with other modalities has huge implications for humans and
computational systems. However multimodal learning with touch remains challenging due to …

Generating visual scenes from touch

F Yang, J Zhang, A Owens - Proceedings of the IEEE/CVF …, 2023 - openaccess.thecvf.com
An emerging line of work has sought to generate plausible imagery from touch. Existing
approaches, however, tackle only narrow aspects of the visuo-tactile synthesis problem, and …

Deepfake generation and detection: A benchmark and survey

G Pei, J Zhang, M Hu, Z Zhang, C Wang, Y Wu… - arXiv preprint arXiv …, 2024 - arxiv.org
Deepfake is a technology dedicated to creating highly realistic facial images and videos
under specific conditions, which has significant application potential in fields such as …

AV-Deepfake1M: A large-scale LLM-driven audio-visual deepfake dataset

Z Cai, S Ghosh, AP Adatia, M Hayat, A Dhall… - Proceedings of the …, 2024 - dl.acm.org
The detection and localization of highly realistic deepfake audio-visual content are
challenging even for the most advanced state-of-the-art methods. While most of the research …

Learning natural consistency representation for face forgery video detection

D Zhang, Z Xiao, S Li, F Lin, J Li, S Ge - European Conference on …, 2024 - Springer
Face Forgery videos have elicited critical social public concerns and various detectors have
been proposed. However, fully-supervised detectors may lead to easily overfitting to specific …

Generation and detection of manipulated multimodal audiovisual content: Advances, trends and open challenges

H Liz-Lopez, M Keita, A Taleb-Ahmed, A Hadid… - Information …, 2024 - Elsevier
Generative deep learning techniques have invaded the public discourse recently. Despite
the advantages, the applications to disinformation are concerning as the counter-measures …

Self-supervised audio-visual soundscape stylization

T Li, R Wang, PY Huang, A Owens… - … on Computer Vision, 2024 - Springer
Speech sounds convey a great deal of information about the scenes, resulting in a variety of
effects ranging from reverberation to additional ambient sounds. In this paper, we …

PVASS-MDD: predictive visual-audio alignment self-supervision for multimodal deepfake detection

Y Yu, X Liu, R Ni, S Yang, Y Zhao… - IEEE Transactions on …, 2023 - ieeexplore.ieee.org
Deepfake techniques can forge the visual or audio signals in the video, which leads to
inconsistencies between visual and audio (VA) signals. Therefore, multimodal detection …

NeuroBind: Towards unified multimodal representations for neural signals

F Yang, C Feng, D Wang, T Wang, Z Zeng, Z Xu… - arXiv preprint arXiv …, 2024 - arxiv.org
Understanding neural activity and information representation is crucial for advancing
knowledge of brain function and cognition. Neural activity, measured through techniques …