Mantis: Interleaved multi-image instruction tuning

D Jiang, X He, H Zeng, C Wei, M Ku, Q Liu… - arXiv preprint arXiv …, 2024 - arxiv.org
Large multimodal models (LMMs) have shown great results in single-image vision language
tasks. However, their ability to solve multi-image visual language tasks is yet to be …

NTIRE 2024 quality assessment of AI-generated content challenge

X Liu, X Min, G Zhai, C Li, T Kou… - Proceedings of the …, 2024 - openaccess.thecvf.com
This paper reports on the NTIRE 2024 Quality Assessment of AI-Generated Content
Challenge which will be held in conjunction with the New Trends in Image Restoration and …

A comprehensive study of multimodal large language models for image quality assessment

T Wu, K Ma, J Liang, Y Yang, L Zhang - European Conference on …, 2024 - Springer
While Multimodal Large Language Models (MLLMs) have experienced significant
advancement in visual understanding and reasoning, their potential to serve as powerful …

AIGIQA-20K: A large database for AI-generated image quality assessment

C Li, T Kou, Y Gao, Y Cao, W Sun… - Proceedings of the …, 2024 - openaccess.thecvf.com
With the rapid advancements in AI-Generated Content (AIGC), AI-Generated Images (AIGIs)
have been widely applied in entertainment, education, and social media. However, due to the …

Depicting beyond scores: Advancing image quality assessment through multi-modal language models

Z You, Z Li, J Gu, Z Yin, T Xue, C Dong - European Conference on …, 2024 - Springer
We introduce a Depicted image Quality Assessment method (DepictQA), overcoming the
constraints of traditional score-based methods. DepictQA allows for detailed, language …

Quality assessment in the era of large models: A survey

Z Zhang, Y Zhou, C Li, B Zhao, X Liu, G Zhai - arXiv preprint arXiv …, 2024 - arxiv.org
Quality assessment, which evaluates the visual quality level of multimedia experiences, has
garnered significant attention from researchers and has evolved substantially through …

MMDU: A multi-turn multi-image dialog understanding benchmark and instruction-tuning dataset for LVLMs

Z Liu, T Chu, Y Zang, X Wei, X Dong, P Zhang… - arXiv preprint arXiv …, 2024 - arxiv.org
Generating natural and meaningful responses to communicate with multi-modal human
inputs is a fundamental capability of Large Vision-Language Models (LVLMs). While current …

Review of image quality assessment methods for compressed images

S Jamil - Journal of Imaging, 2024 - mdpi.com
The compression of images for efficient storage and transmission is crucial in handling large
data volumes. Lossy image compression reduces storage needs but introduces perceptible …

AIGC-VQA: A holistic perception metric for AIGC video quality assessment

Y Lu, X Li, B Li, Z Yu, F Guan, X Wang… - Proceedings of the …, 2024 - openaccess.thecvf.com
With the development of generative models such as the diffusion model and auto-regressive
model, AI-generated content (AIGC) is experiencing explosive growth. Moreover, existing …

Q-bench: A benchmark for multi-modal foundation models on low-level vision from single images to pairs

Z Zhang, H Wu, E Zhang, G Zhai… - IEEE Transactions on …, 2024 - ieeexplore.ieee.org
The rapid development of Multi-modality Large Language Models (MLLMs) has navigated a
paradigm shift in computer vision, moving towards versatile foundational models. However …