ViperGPT: Visual inference via Python execution for reasoning
Answering visual queries is a complex task that requires both visual processing and
reasoning. End-to-end models, the dominant approach for this task, do not explicitly …
Towards trustworthy and aligned machine learning: A data-centric survey with causality perspectives
The trustworthiness of machine learning has emerged as a critical topic in the field,
encompassing various applications and research areas such as robustness, security …
Waffling around for performance: Visual classification with random words and broad concepts
The visual classification performance of vision-language models such as CLIP has been
shown to benefit from additional semantic knowledge from large language models (LLMs) …
Learning without forgetting for vision-language models
Class-Incremental Learning (CIL) or continual learning is a desired capability in the real
world, which requires a learning system to adapt to new tasks without forgetting former ones …
CoTDet: Affordance knowledge prompting for task driven object detection
Task driven object detection aims to detect object instances suitable for affording a task in an
image. Its challenge lies in the object categories available for the task being too diverse to be …
Prompt learning in computer vision: a survey
Prompt learning has attracted broad attention in computer vision since the large pre-trained
vision-language models (VLMs) exploded. Based on the close relationship between vision …
Follow the rules: reasoning for video anomaly detection with large language models
Video Anomaly Detection (VAD) is crucial for applications such as security
surveillance and autonomous driving. However, existing VAD methods provide little …
Bridge the Modality and Capability Gaps in Vision-Language Model Selection
Vision Language Models (VLMs) excel in zero-shot image classification by pairing
images with textual category names. The expanding variety of pre-trained VLMs enhances …
Leveraging Cross-Modal Neighbor Representation for Improved CLIP Classification
CLIP showcases exceptional cross-modal matching capabilities due to its training on image-
text contrastive learning tasks. However, without specific optimization for unimodal scenarios …
Convolutional Prompting meets Language Models for Continual Learning
Continual Learning (CL) enables machine learning models to learn from continuously
shifting new training data in the absence of data from old tasks. Recently, pre-trained vision …