TASED-Net: Temporally-aggregating spatial encoder-decoder network for video saliency detection
TASED-Net is a 3D fully-convolutional network architecture for video saliency detection. It
consists of two building blocks: first, the encoder network extracts low-resolution …
Video saliency forecasting transformer
Video saliency prediction (VSP) aims to imitate eye fixations of humans. However, the
potential of this task has not been fully exploited since existing VSP methods only focus on …
Hierarchical domain-adapted feature learning for video saliency prediction
In this work, we propose a 3D fully convolutional architecture for video saliency prediction
that employs hierarchical supervision on intermediate maps (referred to as conspicuity …
Temporal-spatial feature pyramid for video saliency detection
Q Chang, S Zhu - ar** with the audio …