Metasearch: Incremental product search via deep meta-learning
With the advancement of image processing and computer vision technology, content-based
product search is applied in a wide variety of common tasks, such as online shopping …
Deep convolution feature aggregation: an application to diabetic retinopathy severity level prediction
Diabetic retinopathy (DR) is one of the main causes of loss of vision and blindness in
humans across the world. DR is usually found in patients suffering from diabetes for a long …
LCM-Captioner: A lightweight text-based image captioning method with collaborative mechanism between vision and text
Text-based image captioning (TextCap) aims to remedy the shortcomings of existing image
captioning tasks that ignore text content when describing images. Instead, it requires models …
Using social media images for building function classification
Urban land use on a building instance level is crucial geo-information for many applications
yet challenging to obtain. Street-level images are highly suited to predict building functions …
Alfpn: adaptive learning feature pyramid network for small object detection
Object detection has become a crucial technology in intelligent vision systems, enabling
automatic detection of target objects. While most detectors perform well on open datasets …
From Global to Hybrid: A Review of Supervised Deep Learning for 2D Image Feature Representation
X Dong, Q Wang, H Deng, Z Yang… - IEEE Transactions …, 2025 - ieeexplore.ieee.org
Computer vision is the science that aims to enable computers to emulate human visual
perception, and it encompasses various techniques and methods for extracting and …
A novel feature representation: Aggregating convolution kernels for image retrieval
Activated hidden units in convolutional neural networks (CNNs), known as feature maps,
dominate image representation, which is compact and discriminative. For ultra-large …
Miper-MVS: Multi-scale iterative probability estimation with refinement for efficient multi-view stereo
Multi-view stereo reconstruction aims to construct 3D scenes from multiple 2D images. In
recent years, learning-based multi-view stereo methods have achieved significant results in …
DRL: Dynamic rebalance learning for adversarial robustness of UAV with long-tailed distribution
Adversarial robustness has attracted extensive studies in various fields by increasing the
interpretability of deep learning and enhancing the understanding of neural network models …
Improving cross-dimensional weighting pooling with multi-scale feature fusion for image retrieval
In this paper, we aim to achieve effective image representation for image retrieval in an
unsupervised manner. To this end, we propose a fully cross-dimensional weighting pooling …