Google Scholar

Expanding performance boundaries of open-source multimodal models with model, data, and test-time scaling

Z Chen, W Wang, Y Cao, Y Liu, Z Gao, E Cui… - arXiv preprint arXiv …, 2024 - arxiv.org

We introduce InternVL 2.5, an advanced multimodal large language model (MLLM) series
that builds upon InternVL 2.0, maintaining its core model architecture while introducing …

Cited by 37

Imitate, explore, and self-improve: A reproduction report on slow-thinking reasoning systems

Y Min, Z Chen, J Jiang, J Chen, J Deng, Y Hu… - arXiv preprint arXiv …, 2024 - arxiv.org

Recently, slow-thinking reasoning systems, such as o1, have demonstrated remarkable
capabilities in solving complex reasoning tasks. These systems typically engage in an …

Cited by 16

Process reinforcement through implicit rewards

G Cui, L Yuan, Z Wang, H Wang, W Li, B He… - arXiv preprint arXiv …, 2025 - arxiv.org

Dense process rewards have proven a more effective alternative to the sparse outcome-
level rewards in the inference-time scaling of large language models (LLMs), particularly in …

Cited by 6

Critique fine-tuning: Learning to critique is more effective than learning to imitate

Y Wang, X Yue, W Chen - arXiv preprint arXiv:2501.17703, 2025 - arxiv.org

Supervised Fine-Tuning (SFT) is commonly used to train language models to imitate
annotated responses for given instructions. In this paper, we challenge this paradigm and …

Cited by 2

Formal mathematical reasoning: A new frontier in AI

K Yang, G Poesia, J He, W Li, K Lauter… - arXiv preprint arXiv …, 2024 - arxiv.org

AI for Mathematics (AI4Math) is not only intriguing intellectually but also crucial for AI-driven
discovery in science, engineering, and beyond. Extensive efforts on AI4Math have mirrored …

Cited by 4

LIMO: Less is More for Reasoning

Y Ye, Z Huang, Y Xiao, E Chern, S Xia, P Liu - arXiv preprint arXiv …, 2025 - arxiv.org

We present a fundamental discovery that challenges our understanding of how complex
reasoning emerges in large language models. While conventional wisdom suggests that …

Cited by 4

Technical report: Enhancing LLM reasoning with reward-guided tree search

J Jiang, Z Chen, Y Min, J Chen, X Cheng… - arXiv preprint arXiv …, 2024 - arxiv.org

Recently, test-time scaling has garnered significant attention from the research community,
largely due to the substantial advancements of the o1 model released by OpenAI. By …

Cited by 1

Exploring the Limit of Outcome Reward for Learning Mathematical Reasoning

C Lyu, S Gao, Y Gu, W Zhang, J Gao, K Liu… - arXiv preprint arXiv …, 2025 - arxiv.org

Reasoning abilities, especially those for solving complex math problems, are crucial
components of general intelligence. Recent advances by proprietary companies, such as o …

VideoLLaMA 3: Frontier Multimodal Foundation Models for Image and Video Understanding

B Zhang, K Li, Z Cheng, Z Hu, Y Yuan, G Chen… - arXiv preprint arXiv …, 2025 - arxiv.org

In this paper, we propose VideoLLaMA3, a more advanced multimodal foundation model for
image and video understanding. The core design philosophy of VideoLLaMA3 is vision …

InfiR: Crafting Effective Small Language Models and Multimodal Small Language Models in Reasoning

C Xie, S Cai, W Wang, P Li, Z Sang, K Yang… - arXiv preprint arXiv …, 2025 - arxiv.org

Large Language Models (LLMs) and Multimodal Large Language Models (MLLMs) have
made significant advancements in reasoning capabilities. However, they still face …

NuminaMath: The largest public dataset in AI4Maths with 860k pairs of competition math problems...
