A survey on vision transformer

K Han, Y Wang, H Chen, X Chen, J Guo… - IEEE transactions on …, 2022‏ - ieeexplore.ieee.org
Transformer, first applied to the field of natural language processing, is a type of deep neural
network mainly based on the self-attention mechanism. Thanks to its strong representation …

GhostNet: More features from cheap operations

K Han, Y Wang, Q Tian, J Guo… - Proceedings of the …, 2020‏ - openaccess.thecvf.com
Deploying convolutional neural networks (CNNs) on embedded devices is difficult due to the
limited memory and computation resources. The redundancy in feature maps is an important …

Towards unified text-based person retrieval: A large-scale multi-attribute and language search benchmark

S Yang, Y Zhou, Z Zheng, Y Wang, L Zhu… - Proceedings of the 31st …, 2023‏ - dl.acm.org
In this paper, we introduce a large Multi-Attribute and Language Search dataset for
text-based person retrieval, called MALS, and explore the feasibility of performing pre-training on …

A survey on visual transformer

K Han, Y Wang, H Chen, X Chen, J Guo, Z Liu… - arXiv preprint arXiv …, 2020 - arxiv.org
Transformer, first applied to the field of natural language processing, is a type of deep neural
network mainly based on the self-attention mechanism. Thanks to its strong representation …

Attention, please! A survey of neural attention models in deep learning

A de Santana Correia, EL Colombini - Artificial Intelligence Review, 2022‏ - Springer
In humans, Attention is a core property of all perceptual and cognitive operations. Given our
limited ability to process competing sources, attention mechanisms select, modulate, and …

ABD-Net: Attentive but diverse person re-identification

T Chen, S Ding, J Xie, Y Yuan… - Proceedings of the …, 2019 - openaccess.thecvf.com
Attention mechanisms have been found effective for person re-identification (Re-ID).
However, the learned "attentive" features are often not naturally uncorrelated or "diverse" …

Hierarchical deep click feature prediction for fine-grained image recognition

J Yu, M Tan, H Zhang, Y Rui… - IEEE transactions on …, 2019‏ - ieeexplore.ieee.org
The click feature of an image, defined as the user click frequency vector of the image on a
predefined word vocabulary, is known to effectively reduce the semantic gap for fine-grained …

GhostNets on heterogeneous devices via cheap operations

K Han, Y Wang, C Xu, J Guo, C Xu, E Wu… - International Journal of …, 2022‏ - Springer
Deploying convolutional neural networks (CNNs) on mobile devices is difficult due to the
limited memory and computation resources. We aim to design efficient neural networks for …

Beyond human parts: Dual part-aligned representations for person re-identification

J Guo, Y Yuan, L Huang, C Zhang… - Proceedings of the …, 2019‏ - openaccess.thecvf.com
Person re-identification is a challenging task due to various complex factors. Recent studies
have attempted to integrate human parsing results or externally defined attributes to help …

GreedyNAS: Towards fast one-shot NAS with greedy supernet

S You, T Huang, M Yang, F Wang… - Proceedings of the …, 2020‏ - openaccess.thecvf.com
Training a supernet matters for one-shot neural architecture search (NAS) methods since it
serves as a basic performance estimator for different architectures (paths). Current methods …