Predicting visual fixations
As we navigate and behave in the world, we are constantly deciding, a few times per
second, where to look next. The outcomes of these decisions in response to visual input are …
A comprehensive survey on video saliency detection with auditory information: the audio-visual consistency perceptual is the key!
Video saliency detection (VSD) aims at fast locating the most attractive
objects/things/patterns in a given video clip. Existing VSD-related works have mainly relied …
TranSalNet: Towards perceptually relevant visual saliency prediction
Convolutional neural networks (CNNs) have significantly advanced computational
modelling for saliency prediction. However, accurately simulating the mechanisms of visual …
DeepGaze IIE: Calibrated prediction in and out-of-domain for state-of-the-art saliency modeling
Since 2014 transfer learning has become the key driver for the improvement of spatial
saliency prediction; however, with stagnant progress in the last 3-5 years. We conduct a …
Towards end-to-end video-based eye-tracking
Estimating eye-gaze from images alone is a challenging task, in large parts due to
unobservable person-specific factors. Achieving high accuracy typically requires labeled data …
Video saliency forecasting transformer
Video saliency prediction (VSP) aims to imitate eye fixations of humans. However, the
potential of this task has not been fully exploited since existing VSP methods only focus on …
ViNet: Pushing the limits of visual modality for audio-visual saliency prediction
We propose the ViNet architecture for audio-visual saliency prediction. ViNet is a fully
convolutional encoder-decoder architecture. The encoder uses visual features from a …
Automatic probe movement guidance for freehand obstetric ultrasound
We present the first system that provides real-time probe movement guidance for acquiring
standard planes in routine freehand obstetric ultrasound scanning. Such a system can …
Transformer-based multi-scale feature integration network for video saliency prediction
Most cutting-edge video saliency prediction models rely on spatiotemporal features
extracted by 3D convolutions due to its local contextual cues acquirement ability. However …
UEyes: Understanding visual saliency across user interface types
While user interfaces (UIs) display elements such as images and text in a grid-based layout,
UI types differ significantly in the number of elements and how they are displayed. For …