A unified survey on anomaly, novelty, open-set, and out-of-distribution detection: Solutions and future challenges

M Salehi, H Mirzaei, D Hendrycks, Y Li… - arXiv preprint arXiv …, 2021 - arxiv.org
Machine learning models often encounter samples that are diverged from the training
distribution. Failure to recognize an out-of-distribution (OOD) sample, and consequently …

PromptAD: Zero-shot anomaly detection using text prompts

Y Li, A Goodge, F Liu, CS Foo - Proceedings of the IEEE …, 2024 - openaccess.thecvf.com
We target the problem of zero-shot anomaly detection, in which a model is pre-trained on a
set of seen classes and expected to detect anomalies in other unseen classes at test time …

AMAM: an attention-based multimodal alignment model for medical visual question answering

H Pan, S He, K Zhang, B Qu, C Chen, K Shi - Knowledge-Based Systems, 2022 - Elsevier
Medical Visual Question Answering (VQA) is a multimodal task to answer clinical
questions about medical images. Existing methods have achieved good performance, but …

Reweighted regularized prototypical network for few-shot fault diagnosis

K Li, C Shang, H Ye - IEEE Transactions on Neural Networks …, 2022 - ieeexplore.ieee.org
In this article, we study the challenging few-shot fault diagnosis (FSFD) problem where
limited faulty samples are available. Metric-based meta-learning methods have been a …

COCA: Collaborative causal regularization for audio-visual question answering

M Lao, N Pu, Y Liu, K He, EM Bakker… - Proceedings of the AAAI …, 2023 - ojs.aaai.org
Audio-Visual Question Answering (AVQA) is a sophisticated QA task, which aims at
answering textual questions over given video-audio pairs with comprehensive multimodal …

Rare Category Analysis for Complex Data: A Review

D Zhou, J He - ACM Computing Surveys, 2023 - dl.acm.org
Though the sheer volume of data that is collected is immense, it is the rare categories that
are often the most important in many high-impact domains, ranging from financial fraud …

Benchmarking out-of-distribution detection in visual question answering

X Shi, S Lee - Proceedings of the IEEE/CVF Winter …, 2024 - openaccess.thecvf.com
When faced with an out-of-distribution (OOD) question or image, visual question answering
(VQA) systems may provide unreliable answers. If relied on by real users or secondary …

Deep residual weight-sharing attention network with low-rank attention for visual question answering

B Qin, H Hu, Y Zhuang - IEEE Transactions on Multimedia, 2022 - ieeexplore.ieee.org
The attention-based networks have become prevailing recently in visual question answering
(VQA) due to their high performances. However, the extensive memory consumption of …

Intra- and inter-instance location correlation network for human–object interaction detection

M Lu, G Yang, Y Wang, K Luo - Engineering Applications of Artificial …, 2025 - Elsevier
Objective: Human–object interaction detection is to detect human–object pairs and identify
their interactions, which is of great significance to improve the perception and decision …

DE-GAN: Text-to-image synthesis with dual and efficient fusion model

B Jiang, W Zeng, C Yang, R Wang, B Zhang - Multimedia Tools and …, 2024 - Springer
Generating diverse and plausible images conditioned on the given captions is an attractive
but challenging task. While many existing studies have presented impressive results, text-to …