CleanCLIP: Mitigating data poisoning attacks in multimodal contrastive learning

H Bansal, N Singhi, Y Yang, F Yin… - Proceedings of the …, 2023 - openaccess.thecvf.com
Multimodal contrastive pretraining has been used to train multimodal representation models,
such as CLIP, on large amounts of paired image-text data. However, previous studies have …

Spurious correlations in machine learning: A survey

W Ye, G Zheng, X Cao, Y Ma, A Zhang - arXiv preprint arXiv:2402.12715, 2024 - arxiv.org
Machine learning systems are known to be sensitive to spurious correlations between non-
essential features of the inputs (e.g., background, texture, and secondary objects) and the …

Robust learning with progressive data expansion against spurious correlation

Y Deng, Y Yang, B Mirzasoleiman… - Advances in Neural …, 2023 - proceedings.neurips.cc
While deep learning models have shown remarkable performance in various tasks, they are
susceptible to learning non-generalizable _spurious features_ rather than the core features …

Distilling vision-language models on millions of videos

Y Zhao, L Zhao, X Zhou, J Wu… - Proceedings of the …, 2024 - openaccess.thecvf.com
The recent advance in vision-language models is largely attributed to the abundance of
image-text data. We aim to replicate this success for video-language models but there …

Sieve: Multimodal dataset pruning using image captioning models

A Mahmoud, M Elhoushi, A Abbas… - Proceedings of the …, 2024 - openaccess.thecvf.com
Vision-Language Models (VLMs) are pretrained on large, diverse, and noisy web-
crawled datasets. This underscores the critical need for dataset pruning as the quality of …

Calibrating multi-modal representations: A pursuit of group robustness without annotations

C You, Y Min, W Dai, JS Sekhon… - 2024 IEEE/CVF …, 2024 - ieeexplore.ieee.org
Fine-tuning pre-trained vision-language models, like CLIP, has yielded success on diverse
downstream tasks. However, several pain points persist for this paradigm: (i) directly tuning …

A Sober Look at the Robustness of CLIPs to Spurious Features

Q Wang, Y Lin, Y Chen, L Schmidt… - Advances in Neural …, 2025 - proceedings.neurips.cc
Large vision-language models, such as CLIP, demonstrate more impressive robustness to
spurious features than single-modal models trained on ImageNet. However, existing test …

FD-Align: Feature discrimination alignment for fine-tuning pre-trained models in few-shot learning

K Song, H Ma, B Zou, H Zhang… - Advances in Neural …, 2023 - proceedings.neurips.cc
Due to the limited availability of data, existing few-shot learning methods trained from
scratch fail to achieve satisfactory performance. In contrast, large-scale pre-trained models …

Prompting is a double-edged sword: improving worst-group robustness of foundation models

A Setlur, S Garg, V Smith, S Levine - Forty-first International …, 2024 - openreview.net
Machine learning models fail catastrophically under distribution shift, but a surprisingly
effective way to empirically improve robustness to some types of shift (e.g., ImageNet-A/C) …

Zero-shot robustification of zero-shot models

D Adila, C Shin, L Cai, F Sala - arXiv preprint arXiv:2309.04344, 2023 - arxiv.org
Zero-shot inference is a powerful paradigm that enables the use of large pretrained models
for downstream classification tasks without further training. However, these models are …