Distributed intelligence for IoT-based smart cities: a survey
The remarkable miniaturization of Internet of Things (IoT)-based systems and the rise of
distributed intelligence are promising research paradigms in the design of smart cities. IoT …
Binding touch to everything: Learning unified multimodal tactile representations
The ability to associate touch with other modalities has huge implications for humans and
computational systems. However, multimodal learning with touch remains challenging due to …
Generating visual scenes from touch
An emerging line of work has sought to generate plausible imagery from touch. Existing
approaches, however, tackle only narrow aspects of the visuo-tactile synthesis problem, and …
Deepfake generation and detection: A benchmark and survey
Deepfake is a technology dedicated to creating highly realistic facial images and videos
under specific conditions, which has significant application potential in fields such as …
AV-Deepfake1M: A large-scale LLM-driven audio-visual deepfake dataset
The detection and localization of highly realistic deepfake audio-visual content are
challenging even for the most advanced state-of-the-art methods. While most of the research …
Learning natural consistency representation for face forgery video detection
Face forgery videos have raised serious public concern, and various detectors have
been proposed. However, fully supervised detectors may easily overfit to specific …
Generation and detection of manipulated multimodal audiovisual content: Advances, trends and open challenges
Generative deep learning techniques have invaded the public discourse recently. Despite
the advantages, the applications to disinformation are concerning as the counter-measures …
Self-supervised audio-visual soundscape stylization
Speech sounds convey a great deal of information about the scenes, resulting in a variety of
effects ranging from reverberation to additional ambient sounds. In this paper, we …
PVASS-MDD: predictive visual-audio alignment self-supervision for multimodal deepfake detection
Deepfake techniques can forge the visual or audio signals in the video, which leads to
inconsistencies between visual and audio (VA) signals. Therefore, multimodal detection …
NeuroBind: Towards unified multimodal representations for neural signals
Understanding neural activity and information representation is crucial for advancing
knowledge of brain function and cognition. Neural activity, measured through techniques …