Large-scale multi-modal pre-trained models: A comprehensive survey
With the urgent demand for generalized deep models, many pre-trained big models are
proposed, such as bidirectional encoder representations (BERT), vision transformer (ViT) …
proposed, such as bidirectional encoder representations (BERT), vision transformer (ViT) …
Survey on large language model-enhanced reinforcement learning: Concept, taxonomy, and methods
With extensive pretrained knowledge and high-level general capabilities, large language
models (LLMs) emerge as a promising avenue to augment reinforcement learning (RL) in …
models (LLMs) emerge as a promising avenue to augment reinforcement learning (RL) in …
Multimodal learning with transformers: A survey
Transformer is a promising neural network learner, and has achieved great success in
various machine learning tasks. Thanks to the recent prevalence of multimodal applications …
various machine learning tasks. Thanks to the recent prevalence of multimodal applications …
V2x-vit: Vehicle-to-everything cooperative perception with vision transformer
In this paper, we investigate the application of Vehicle-to-Everything (V2X) communication to
improve the perception performance of autonomous vehicles. We present a robust …
improve the perception performance of autonomous vehicles. We present a robust …
R2former: Unified retrieval and reranking transformer for place recognition
Abstract Visual Place Recognition (VPR) estimates the location of query images by matching
them with images in a reference database. Conventional methods generally adopt …
them with images in a reference database. Conventional methods generally adopt …
Semantic segmentation using Vision Transformers: A survey
H Thisanke, C Deshan, K Chamith… - … Applications of Artificial …, 2023 - Elsevier
Semantic segmentation has a broad range of applications in a variety of domains including
land coverage analysis, autonomous driving, and medical image analysis. Convolutional …
land coverage analysis, autonomous driving, and medical image analysis. Convolutional …
A survey of the vision transformers and their CNN-transformer based variants
Vision transformers have become popular as a possible substitute to convolutional neural
networks (CNNs) for a variety of computer vision applications. These transformers, with their …
networks (CNNs) for a variety of computer vision applications. These transformers, with their …
Video transformers: A survey
Transformer models have shown great success handling long-range interactions, making
them a promising tool for modeling video. However, they lack inductive biases and scale …
them a promising tool for modeling video. However, they lack inductive biases and scale …
CSwin-PNet: A CNN-Swin Transformer combined pyramid network for breast lesion segmentation in ultrasound images
H Yang, D Yang - Expert Systems with Applications, 2023 - Elsevier
Currently, the automatic segmentation of breast tumors based on breast ultrasound (BUS)
images is still a challenging task. Most lesion segmentation methods are implemented …
images is still a challenging task. Most lesion segmentation methods are implemented …
Transformers in speech processing: A survey
The remarkable success of transformers in the field of natural language processing has
sparked the interest of the speech-processing community, leading to an exploration of their …
sparked the interest of the speech-processing community, leading to an exploration of their …