Parrot Captions Teach CLIP to Spot Text
Despite CLIP being the foundation model in numerous vision-language applications, CLIP
suffers from a severe text spotting bias. Such bias causes CLIP models to 'Parrot' the visual …
Improving Geo-Diversity of Generated Images with Contextualized Vendi Score Guidance
With the growing popularity of text-to-image generative models, there has been increasing
focus on understanding their risks and biases. Recent work has found that state-of-the-art …
Can CLIP Count Stars? An Empirical Study on Quantity Bias in CLIP
CLIP has demonstrated great versatility in adapting to various downstream tasks, such as
image editing and generation, visual question answering, and video understanding …
Mechanistic understanding and validation of large AI models with SemanticLens
Unlike human-engineered systems such as aeroplanes, where each component's role and
dependencies are well understood, the inner workings of AI models remain largely opaque …
Instruction-Guided Editing Controls for Images and Multimedia: A Survey in LLM era
The rapid advancement of large language models (LLMs) and multimodal learning has
transformed digital content creation and manipulation. Traditional visual editing tools require …
Uncovering Bias in Foundation Models: Impact, Testing, Harm, and Mitigation
Bias in Foundation Models (FMs), trained on vast datasets spanning societal and historical
knowledge, poses significant challenges for fairness and equity across fields such as …
SANER: Annotation-free Societal Attribute Neutralizer for Debiasing CLIP
Large-scale vision-language models, such as CLIP, are known to contain harmful societal
bias regarding protected attributes (e.g., gender and age). In this paper, we aim to address …
[BOOK] Computer Vision – ECCV 2024: 18th European Conference, Milan, Italy, September 29–October 4, 2024, Proceedings, Part XXIV.
A Leonardis - 2024 - books.google.com
The multi-volume set of LNCS books with volume numbers 15059 up to 15147 constitutes
the refereed proceedings of the 18th European Conference on Computer Vision, ECCV …
[PDF] Generative Artificial Intelligence and Digital Ageism: Exploring the Construction of Age and Aging by Image-Generating AI
T Kamelski, D Klinge - 2024 - osf.io
Since 2022, the growing attention to and public accessibility of generative artificial
intelligence (AI) have become essential for knowledge acquisition on digital platforms …