Review of large vision models and visual prompt engineering

J Wang, Z Liu, L Zhao, Z Wu, C Ma, S Yu, H Dai… - Meta-Radiology, 2023 - Elsevier
Visual prompt engineering is a fundamental methodology in the field of visual and image
artificial general intelligence. As the development of large vision models progresses, the …

End-edge-cloud collaborative computing for deep learning: A comprehensive survey

Y Wang, C Yang, S Lan, L Zhu… - … Surveys & Tutorials, 2024 - ieeexplore.ieee.org
The booming development of deep learning applications and services heavily relies on
large deep learning models and massive data in the cloud. However, cloud-based deep …

SCConv: Spatial and channel reconstruction convolution for feature redundancy

J Li, Y Wen, L He - … of the IEEE/CVF conference on …, 2023 - openaccess.thecvf.com
Convolutional Neural Networks (CNNs) have achieved remarkable performance in
various computer vision tasks but this comes at the cost of tremendous computational …

EfficientSAM: Leveraged masked image pretraining for efficient segment anything

Y Xiong, B Varadarajan, L Wu… - Proceedings of the …, 2024 - openaccess.thecvf.com
Segment Anything Model (SAM) has emerged as a powerful tool for numerous
vision applications. A key component that drives the impressive performance for zero-shot …

Logit standardization in knowledge distillation

S Sun, W Ren, J Li, R Wang… - Proceedings of the IEEE …, 2024 - openaccess.thecvf.com
Knowledge distillation involves transferring soft labels from a teacher to a student
using a shared temperature-based softmax function. However the assumption of a shared …

Effective whole-body pose estimation with two-stages distillation

Z Yang, A Zeng, C Yuan, Y Li - Proceedings of the IEEE …, 2023 - openaccess.thecvf.com
Whole-body pose estimation localizes the human body, hand, face, and foot keypoints in an
image. This task is challenging due to multi-scale body parts, fine-grained localization for …

Decoupled multimodal distilling for emotion recognition

Y Li, Y Wang, Z Cui - … of the IEEE/CVF conference on …, 2023 - openaccess.thecvf.com
Human multimodal emotion recognition (MER) aims to perceive human emotions via
language, visual and acoustic modalities. Despite the impressive performance of previous …

Curriculum temperature for knowledge distillation

Z Li, X Li, L Yang, B Zhao, R Song, L Luo, J Li… - Proceedings of the …, 2023 - ojs.aaai.org
Most existing distillation methods ignore the flexible role of the temperature in the loss
function and fix it as a hyper-parameter that can be decided by an inefficient grid search. In …

Densely knowledge-aware network for multivariate time series classification

Z Xiao, H Xing, R Qu, L Feng, S Luo… - … on Systems, Man …, 2024 - ieeexplore.ieee.org
Multivariate time series classification (MTSC) based on deep learning (DL) has attracted
increasingly more research attention. The performance of a DL-based MTSC algorithm is …

Multi-level logit distillation

Y Jin, J Wang, D Lin - … of the IEEE/CVF Conference on …, 2023 - openaccess.thecvf.com
Knowledge Distillation (KD) aims at distilling the knowledge from the large teacher
model to a lightweight student model. Mainstream KD methods can be divided into two …