- Academic Search

QA dataset explosion: A taxonomy of NLP resources for question answering and reading comprehension

A Rogers, M Gardner, I Augenstein - ACM Computing Surveys, 2023 - dl.acm.org

Alongside huge volumes of research on deep learning models in NLP in the recent years,
there has been much work on benchmark datasets needed to track modeling progress …

Cited by 232, 6 versions

Large language model for table processing: A survey

W Lu, J Zhang, J Fan, Z Fu, Y Chen, X Du - Frontiers of Computer Science, 2025 - Springer

Tables, typically two-dimensional and structured to store large amounts of data, are
essential in daily activities like database queries, spreadsheet manipulations, Web table …

Cited by 22, 2 versions

Making language models better reasoners with step-aware verifier

Y Li, Z Lin, S Zhang, Q Fu, B Chen… - Proceedings of the …, 2023 - aclanthology.org

Few-shot learning is a challenging task that requires language models to generalize from
limited examples. Large language models like GPT-3 and PaLM have made impressive …

Cited by 166, 2 versions

AgentBench: Evaluating LLMs as agents

X Liu, H Yu, H Zhang, Y Xu, X Lei, H Lai, Y Gu… - arXiv preprint arXiv …, 2023 - arxiv.org

Large Language Models (LLMs) are becoming increasingly smart and autonomous,
targeting real-world pragmatic missions beyond traditional NLP tasks. As a result, there has …

Cited by 260, 3 versions

Dynamic prompt learning via policy gradient for semi-structured mathematical reasoning

P Lu, L Qiu, KW Chang, YN Wu, SC Zhu… - arXiv preprint arXiv …, 2022 - arxiv.org

Mathematical reasoning, a core ability of human intelligence, presents unique challenges for
machines in abstract thinking and logical reasoning. Recent large pre-trained language …

Cited by 226, 7 versions

FOLIO: Natural language reasoning with first-order logic

S Han, H Schoelkopf, Y Zhao, Z Qi, M Riddell… - arXiv preprint arXiv …, 2022 - arxiv.org

Large language models (LLMs) have achieved remarkable performance on a variety of
natural language understanding tasks. However, existing benchmarks are inadequate in …

Cited by 106, 2 versions

FinQA: A dataset of numerical reasoning over financial data

Z Chen, W Chen, C Smiley, S Shah, I Borova… - arXiv preprint arXiv …, 2021 - arxiv.org

The sheer volume of financial statements makes it difficult for humans to access and analyze
a business's financials. Robust numerical reasoning likewise faces unique challenges in this …

Cited by 267, 9 versions

TAT-QA: A question answering benchmark on a hybrid of tabular and textual content in finance

F Zhu, W Lei, Y Huang, C Wang, S Zhang, J Lv… - arXiv preprint arXiv …, 2021 - arxiv.org

Hybrid data combining both tabular and textual content (e.g., financial reports) are quite
pervasive in the real world. However, Question Answering (QA) over such hybrid data is …

Cited by 242, 7 versions

Large language models are few (1)-shot table reasoners

W Chen - arXiv preprint arXiv:2210.06710, 2022 - arxiv.org

Recent literature has shown that large language models (LLMs) are generally excellent few-shot
reasoners to solve text reasoning tasks. However, the capability of LLMs on table …

Cited by 141, 3 versions

♫ MuSiQue: Multihop Questions via Single-hop Question Composition

H Trivedi, N Balasubramanian, T Khot… - Transactions of the …, 2022 - direct.mit.edu

Multihop reasoning remains an elusive goal as existing multihop benchmarks are known to
be largely solvable via shortcuts. Can we create a question answering (QA) dataset that, by …

Cited by 236, 8 versions

HybridQA: A dataset of multi-hop question answering over tabular and textual data

QA dataset explosion: A taxonomy of NLP resources for question answering and reading comprehension
