Visual tuning
Fine-tuning visual models has been widely shown to achieve promising performance on many
downstream visual tasks. With the surprising development of pre-trained visual foundation …
One prompt word is enough to boost adversarial robustness for pre-trained vision-language models
Large pre-trained Vision-Language Models (VLMs) like CLIP, despite having
remarkable generalization ability, are highly vulnerable to adversarial examples. This work …
AGD-GAN: Adaptive Gradient-Guided and Depth-supervised generative adversarial networks for ancient mural sketch extraction
Z Yu, S Peng, S Qu, Q Zhang, J Wang… - Expert Systems with …, 2024 - Elsevier
To address the overlooked issues of multi-scale detail feature extraction and disease noise
suppression in mural sketch extraction, we proposed a novel generative adversarial network …
Enhancing object coherence in layout-to-image synthesis
InstaFormer++: Multi-Domain Instance-Aware Image-to-Image Translation with Transformer
We present a novel Transformer-based network architecture for instance-aware image-to-image
translation, dubbed InstaFormer, to effectively integrate global- and instance-level …
Empowering LLMs for Multi-Page Layout Generation via Consistency-Oriented In-Context Learning
Document layout generation, a burgeoning field of document intelligence, entails positioning
and sizing various elements within given constraints. While significant strides have been …