- Academic Search

H Dong, Z Cheng, X He, M Zhou, A Zhou… - arxiv preprint arxiv …, 2022 - arxiv.org

Since a vast number of tables can be easily collected from web pages, spreadsheets, PDFs,
and various other document types, a flurry of table pre-training frameworks have been …

Cited by: 63 · 4 versions

LLaVA-OneVision: Easy visual task transfer

B Li, Y Zhang, D Guo, R Zhang, F Li, H Zhang… - arxiv preprint arxiv …, 2024 - arxiv.org

We present LLaVA-OneVision, a family of open large multimodal models (LMMs) developed
by consolidating our insights into data, models, and visual representations in the LLaVA …

Cited by: 248 · 2 versions

Cambrian-1: A fully open, vision-centric exploration of multimodal LLMs

S Tong, E Brown, P Wu, S Woo, M Middepogu… - arxiv preprint arxiv …, 2024 - arxiv.org

We introduce Cambrian-1, a family of multimodal LLMs (MLLMs) designed with a vision-centric
approach. While stronger language models can enhance multimodal capabilities, the …

Cited by: 170 · 4 versions

A survey of table reasoning with large language models

X Zhang, D Wang, L Dou, Q Zhu, W Che - Frontiers of Computer Science, 2025 - Springer

Table reasoning aims to generate inference results based on the user requirement and the
provided table. Enhancing the table reasoning capability of the model can aid in obtaining …

Cited by: 8 · 2 versions

MultiHiertt: Numerical reasoning over multi hierarchical tabular and textual data

Y Zhao, Y Li, C Li, R Zhang - arxiv preprint arxiv:2206.01347, 2022 - arxiv.org

Numerical reasoning over hybrid data containing both textual and tabular content (e.g.,
financial reports) has recently attracted much attention in the NLP community. However …

Cited by: 89 · 5 versions

NVLM: Open frontier-class multimodal LLMs

W Dai, N Lee, B Wang, Z Yang, Z Liu, J Barker… - arxiv preprint arxiv …, 2024 - arxiv.org

We introduce NVLM 1.0, a family of frontier-class multimodal large language models (LLMs)
that achieve state-of-the-art results on vision-language tasks, rivaling the leading proprietary …

Cited by: 21 · 4 versions

MM1.5: Methods, analysis & insights from multimodal LLM fine-tuning

H Zhang, M Gao, Z Gan, P Dufter, N Wenzel… - arxiv preprint arxiv …, 2024 - arxiv.org

We present MM1.5, a new family of multimodal large language models (MLLMs) designed
to enhance capabilities in text-rich image understanding, visual referring and grounding …

Cited by: 15 · 2 versions

TableLlama: Towards open large generalist models for tables

T Zhang, X Yue, Y Li, H Sun - arxiv preprint arxiv:2311.09206, 2023 - arxiv.org

Semi-structured tables are ubiquitous. There has been a variety of tasks that aim to
automatically interpret, augment, and query tables. Current methods often require …

Cited by: 71 · 4 versions

A survey of reasoning with foundation models

J Sun, C Zheng, E Xie, Z Liu, R Chu, J Qiu, J Xu… - arxiv preprint arxiv …, 2023 - arxiv.org

Reasoning, a crucial ability for complex problem-solving, plays a pivotal role in various
real-world settings such as negotiation, medical diagnosis, and criminal investigation. It serves …

Cited by: 41 · 2 versions

Large language models for tabular data: Progresses and future directions

H Dong, Z Wang - Proceedings of the 47th International ACM SIGIR …, 2024 - dl.acm.org

Tables contain a significant portion of the world's structured information. The ability to
efficiently and accurately understand, process, reason about, analyze, and generate tabular …

Cited by: 12 · 2 versions

Hitab: A hierarchical table dataset for question answering and natural language generation

Table pre-training: A survey on model architectures, pre-training objectives, and downstream tasks
