QA dataset explosion: A taxonomy of NLP resources for question answering and reading comprehension
Alongside huge volumes of research on deep learning models in NLP in recent years,
there has been much work on benchmark datasets needed to track modeling progress …
Large language model for table processing: A survey
Tables, typically two-dimensional and structured to store large amounts of data, are
essential in daily activities like database queries, spreadsheet manipulations, Web table …
Making language models better reasoners with step-aware verifier
Few-shot learning is a challenging task that requires language models to generalize from
limited examples. Large language models like GPT-3 and PaLM have made impressive …
AgentBench: Evaluating LLMs as agents
Large Language Models (LLMs) are becoming increasingly smart and autonomous,
targeting real-world pragmatic missions beyond traditional NLP tasks. As a result, there has …
Dynamic prompt learning via policy gradient for semi-structured mathematical reasoning
Mathematical reasoning, a core ability of human intelligence, presents unique challenges for
machines in abstract thinking and logical reasoning. Recent large pre-trained language …
FOLIO: Natural language reasoning with first-order logic
Large language models (LLMs) have achieved remarkable performance on a variety of
natural language understanding tasks. However, existing benchmarks are inadequate in …
FinQA: A dataset of numerical reasoning over financial data
The sheer volume of financial statements makes it difficult for humans to access and analyze
a business's financials. Robust numerical reasoning likewise faces unique challenges in this …
TAT-QA: A question answering benchmark on a hybrid of tabular and textual content in finance
Hybrid data combining both tabular and textual content (e.g., financial reports) are quite
pervasive in the real world. However, Question Answering (QA) over such hybrid data is …
Large language models are few (1)-shot table reasoners
W Chen - arXiv preprint arXiv:2210.06710, 2022 - arxiv.org
Recent literature has shown that large language models (LLMs) are generally excellent few-
shot reasoners to solve text reasoning tasks. However, the capability of LLMs on table …
♫ MuSiQue: Multihop Questions via Single-hop Question Composition
Multihop reasoning remains an elusive goal as existing multihop benchmarks are known to
be largely solvable via shortcuts. Can we create a question answering (QA) dataset that, by …