QA dataset explosion: A taxonomy of NLP resources for question answering and reading comprehension

A Rogers, M Gardner, I Augenstein - ACM Computing Surveys, 2023 - dl.acm.org
Alongside huge volumes of research on deep learning models in NLP in the recent years,
there has been much work on benchmark datasets needed to track modeling progress …

Trick me if you can: Human-in-the-loop generation of adversarial examples for question answering

E Wallace, P Rodriguez, S Feng, I Yamada… - Transactions of the …, 2019 - direct.mit.edu
Adversarial evaluation stress-tests a model's understanding of natural language. Because
past approaches expose superficial patterns, the resulting adversarial examples are limited …

What can AI do for me? Evaluating machine learning interpretations in cooperative play

S Feng, J Boyd-Graber - … of the 24th International Conference on …, 2019 - dl.acm.org
Machine learning is an important tool for decision making, but its ethical and responsible
application requires rigorous vetting of its interpretability and utility: an understudied …

Towards a robust deep neural network against adversarial texts: A survey

W Wang, R Wang, L Wang, Z Wang… - IEEE Transactions on …, 2021 - ieeexplore.ieee.org
Deep neural networks (DNNs) have achieved remarkable success in various tasks (e.g.,
image classification, speech recognition, and natural language processing (NLP)). However …

Towards a robust deep neural network in texts: A survey

W Wang, R Wang, L Wang, Z Wang, A Ye - arXiv preprint arXiv …, 2019 - arxiv.org
Deep neural networks (DNNs) have achieved remarkable success in various tasks (e.g.,
image classification, speech recognition, and natural language processing (NLP)). However …

Mastering the ABCDs of Complex Questions: Answer-Based Claim Decomposition for Fine-grained Self-Evaluation

N Balepur, J Huang, S Moorjani, H Sundaram… - arXiv preprint arXiv …, 2023 - arxiv.org
When answering complex questions, large language models (LLMs) may produce answers
that do not satisfy all criteria of the question. While existing self-evaluation techniques aim to …

Mitigating noisy inputs for question answering

D Peskov, J Barrow, P Rodriguez, G Neubig… - arXiv preprint arXiv …, 2019 - arxiv.org
Natural language processing systems are often downstream of unreliable inputs: machine
translation, optical character recognition, or speech recognition. For instance, virtual …

Trick me if you can: Adversarial writing of trivia challenge questions

E Wallace, J Boyd-Graber - ACL Student Research Workshop, 2018 - par.nsf.gov
Modern question answering systems have been touted as approaching human
performance. However, existing question answering datasets are imperfect tests. Questions …

Evaluating Machine Intelligence With Question Answering

P Rodriguez - 2021 - search.proquest.com
Humans ask questions to learn about the world and to test knowledge understanding. The
ability to ask questions combines aspects of intelligence unique to humans: language …

Gathering Natural Language Processing Data Using Experts

D Peskov - 2021 - search.proquest.com
Natural language processing needs substantial data to make robust predictions. Automatic
methods, unspecialized crowds, and domain experts can be used to collect conversational …