Attacks and defenses in user authentication systems: A survey
X Wang, Z Yan, R Zhang, P Zhang - Journal of Network and Computer …, 2021 - Elsevier
User authentication systems (in short authentication systems) have wide utilization in our
daily life. Unfortunately, existing authentication systems are prone to various attacks while …
Survey on automatic lip-reading in the era of deep learning
In the last few years, there has been an increasing interest in developing systems for
Automatic Lip-Reading (ALR). Similarly to other computer vision applications, methods …
Balanced multimodal learning via on-the-fly gradient modulation
Audio-visual learning helps to comprehensively understand the world, by integrating
different senses. Accordingly, multiple input modalities are expected to boost model …
Sub-word level lip reading with visual attention
The goal of this paper is to learn strong lip reading models that can recognise speech in
silent videos. Most prior works deal with the open-set visual speech recognition problem by …
Combining residual networks with LSTMs for lipreading
We propose an end-to-end deep learning architecture for word-level visual speech
recognition. The system is a combination of spatiotemporal convolutional, residual and …
Multimodal deep learning
Deep networks have been successfully applied to unsupervised feature learning for single
modalities (e.g., text, images or audio). In this work, we propose a novel application of deep …
Partial multi-view clustering
Real data are often with multiple modalities or coming from multiple channels, while
multi-view clustering provides a natural formulation for generating clusters from such data …
Multimodal sparse transformer network for audio-visual speech recognition
Automatic speech recognition (ASR) is the major human–machine interface in many
intelligent systems, such as intelligent homes, autonomous driving, and servant robots …
Large-scale visual speech recognition
This work presents a scalable solution to open-vocabulary visual speech recognition. To
achieve this, we constructed the largest existing visual speech recognition dataset …
Multimodal human–computer interaction: A survey
In this paper, we review the major approaches to multimodal human–computer interaction,
giving an overview of the field from a computer vision perspective. In particular, we focus on …