VIKSER: Visual Knowledge-Driven Self-Reinforcing Reasoning Framework
C Zhang, C Wang, Y Zhou, Y Peng - arXiv preprint arXiv:2502.00711, 2025 - arxiv.org
Visual reasoning refers to the task of solving questions about visual information. Current
visual reasoning methods typically employ pre-trained vision-language model (VLM) …