Deep learning in computer vision: A critical review of emerging techniques and application scenarios
Deep learning has been overwhelmingly successful in computer vision (CV), natural
language processing, and video/speech recognition. In this paper, our focus is on CV. We …
Artificial intelligence in the creative industries: a review
This paper reviews the current state of the art in artificial intelligence (AI) technologies and
applications in the context of the creative industries. A brief background of AI, and …
SeqTrack: Sequence to sequence learning for visual object tracking
In this paper, we present a new sequence-to-sequence learning framework for visual
tracking, dubbed SeqTrack. It casts visual tracking as a sequence generation problem …
Universal instance perception as object discovery and retrieval
All instance perception tasks aim at finding certain objects specified by some queries such
as category names, language expressions, and target annotations, but this complete field …
Visual prompt multi-modal tracking
Visible-modal object tracking gives rise to a series of downstream multi-modal tracking
tributaries. To inherit the powerful representations of the foundation model, a natural modus …
MixFormer: End-to-end tracking with iterative mixed attention
Tracking often uses a multi-stage pipeline of feature extraction, target information
integration, and bounding box estimation. To simplify this pipeline and unify the process of …
Joint feature learning and relation modeling for tracking: A one-stream framework
The current popular two-stream, two-stage tracking framework extracts the template and the
search region features separately and then performs relation modeling, thus the extracted …
Autoregressive visual tracking
We present ARTrack, an autoregressive framework for visual object tracking. ARTrack
tackles tracking as a coordinate sequence interpretation task that estimates object …
AiATrack: Attention in attention for transformer visual tracking
Transformer trackers have achieved impressive advancements recently, where the attention
mechanism plays an important role. However, the independent correlation computation in …
Generalized relation modeling for transformer tracking
Compared with previous two-stream trackers, the recent one-stream tracking pipeline, which
allows earlier interaction between the template and search region, has achieved a …