LegalBench: A collaboratively built benchmark for measuring legal reasoning in large language models

N Guha, J Nyarko, D Ho, C Ré… - Advances in …, 2023 - proceedings.neurips.cc
The advent of large language models (LLMs) and their adoption by the legal community has
given rise to the question: what types of legal reasoning can LLMs perform? To enable …

On the opportunities and risks of foundation models

R Bommasani, DA Hudson, E Adeli, R Altman… - arXiv preprint arXiv …, 2021 - arxiv.org
AI is undergoing a paradigm shift with the rise of models (e.g., BERT, DALL-E, GPT-3) that are
trained on broad data at scale and are adaptable to a wide range of downstream tasks. We …

Domino: Discovering systematic errors with cross-modal embeddings

S Eyuboglu, M Varma, K Saab, JB Delbrouck… - arXiv preprint arXiv …, 2022 - arxiv.org
Machine learning models that achieve high overall accuracy often make systematic errors
on important subsets (or slices) of data. Identifying underperforming slices is particularly …

Robustness Gym: Unifying the NLP evaluation landscape

K Goel, N Rajani, J Vig, S Tan, J Wu, S Zheng… - arXiv preprint arXiv …, 2021 - arxiv.org
Despite impressive performance on standard benchmarks, deep neural networks are often
brittle when deployed in real-world systems. Consequently, recent research has focused on …

RNG-KBQA: Generation augmented iterative ranking for knowledge base question answering

X Ye, S Yavuz, K Hashimoto, Y Zhou… - arXiv preprint arXiv …, 2021 - arxiv.org
Existing KBQA approaches, despite achieving strong performance on i.i.d. test data, often
struggle in generalizing to questions involving unseen KB schema items. Prior ranking …

ReFinED: An efficient zero-shot-capable approach to end-to-end entity linking

T Ayoola, S Tyagi, J Fisher… - arXiv preprint arXiv …, 2022 - arxiv.org
We introduce ReFinED, an efficient end-to-end entity linking model which uses fine-grained
entity types and entity descriptions to perform linking. The model performs mention …

ReTraCk: A flexible and efficient framework for knowledge base question answering

S Chen, Q Liu, Z Yu, CY Lin, JG Lou… - Proceedings of the 59th …, 2021 - aclanthology.org
We present Retriever-Transducer-Checker (ReTraCk), a neural semantic parsing
framework for large scale knowledge base question answering (KBQA). ReTraCk is …

Saga: A platform for continuous construction and serving of knowledge at scale

IF Ilyas, T Rekatsinas, V Konda, J Pound, X Qi… - Proceedings of the …, 2022 - dl.acm.org
We introduce Saga, a next-generation knowledge construction and serving platform for
powering knowledge-based applications at industrial scale. Saga follows a hybrid batch …

How can voting mechanisms improve the robustness and generalizability of toponym disambiguation?

X Hu, Y Sun, J Kersten, Z Zhou, F Klan, H Fan - International Journal of …, 2023 - Elsevier
Natural language texts, such as tweets and news, contain a vast amount of geospatial
information, which can be extracted by first recognizing toponyms in texts (toponym …

Evaluating entity disambiguation and the role of popularity in retrieval-based NLP

A Chen, P Gudipati, S Longpre, X Ling… - arXiv preprint arXiv …, 2021 - arxiv.org
Retrieval is a core component for open-domain NLP tasks. In open-domain tasks, multiple
entities can share a name, making disambiguation an inherent yet under-explored problem …