Efficient attention: Attention with linear complexities
Dot-product attention has wide applications in computer vision and natural language
processing. However, its memory and computational costs grow quadratically with the input …
Self-supervision with superpixels: Training few-shot medical image segmentation without annotation
Few-shot semantic segmentation (FSS) has great potential for medical imaging applications.
Most of the existing FSS techniques require abundant annotated semantic classes for …
Few-shot object detection: Research advances and challenges
Object detection as a subfield within computer vision has achieved remarkable progress,
which aims to accurately identify and locate a specific object from images or videos. Such …
Self-supervised learning for few-shot medical image segmentation
Fully-supervised deep learning segmentation models are inflexible when encountering new
unseen semantic classes and their fine-tuning often requires significant amounts of …
The dawn of quantum natural language processing
In this paper, we discuss the initial attempts at boosting understanding human language
based on deep-learning models with quantum computing. We successfully train a quantum …
Optimizing numerical estimation and operational efficiency in the legal domain through large language models
The legal landscape encompasses a wide array of lawsuit types, presenting lawyers with
challenges in delivering timely and accurate information to clients, particularly concerning …
Expert-defined keywords improve interpretability of retinal image captioning
Automatic machine learning-based (ML-based) medical report generation systems for retinal
images suffer from a relative lack of interpretability. Hence, such ML-based systems are still …
A novel evaluation framework for image2text generation
Evaluating the quality of automatically generated image descriptions is challenging,
requiring metrics that capture various aspects such as grammaticality, coverage …
Query-controllable video summarization
When video collections become huge, how to explore both within and across videos
efficiently is challenging. Video summarization is one of the ways to tackle this issue …
DeepOpht: medical report generation for retinal images via deep models and visual explanation
In this work, we propose an AI-based method that intends to improve the conventional retinal
disease treatment procedure and help ophthalmologists increase diagnosis efficiency and …