Towards Text-Image Interleaved Retrieval

X Zhang, Z Dai, Y Li, Y Zhang, D Long, P Xie… - arXiv preprint arXiv …, 2025 - arxiv.org
Current multimodal information retrieval studies mainly focus on single-image inputs, which
limits real-world applications involving multiple images and text-image interleaved content …