LegalBench: A collaboratively built benchmark for measuring legal reasoning in large language models
The advent of large language models (LLMs) and their adoption by the legal community has
given rise to the question: what types of legal reasoning can LLMs perform? To enable …
On the opportunities and risks of foundation models
AI is undergoing a paradigm shift with the rise of models (e.g., BERT, DALL-E, GPT-3) that are
trained on broad data at scale and are adaptable to a wide range of downstream tasks. We …
Domino: Discovering systematic errors with cross-modal embeddings
Machine learning models that achieve high overall accuracy often make systematic errors
on important subsets (or slices) of data. Identifying underperforming slices is particularly …
Robustness Gym: Unifying the NLP evaluation landscape
Despite impressive performance on standard benchmarks, deep neural networks are often
brittle when deployed in real-world systems. Consequently, recent research has focused on …
RnG-KBQA: Generation augmented iterative ranking for knowledge base question answering
Existing KBQA approaches, despite achieving strong performance on i.i.d. test data, often
struggle in generalizing to questions involving unseen KB schema items. Prior ranking …
ReFinED: An efficient zero-shot-capable approach to end-to-end entity linking
We introduce ReFinED, an efficient end-to-end entity linking model which uses fine-grained
entity types and entity descriptions to perform linking. The model performs mention …
ReTraCk: A flexible and efficient framework for knowledge base question answering
We present Retriever-Transducer-Checker (ReTraCk), a neural semantic parsing
framework for large scale knowledge base question answering (KBQA). ReTraCk is …
Saga: A platform for continuous construction and serving of knowledge at scale
We introduce Saga, a next-generation knowledge construction and serving platform for
powering knowledge-based applications at industrial scale. Saga follows a hybrid batch …
How can voting mechanisms improve the robustness and generalizability of toponym disambiguation?
Natural language texts, such as tweets and news, contain a vast amount of geospatial
information, which can be extracted by first recognizing toponyms in texts (toponym …
Evaluating entity disambiguation and the role of popularity in retrieval-based NLP
Retrieval is a core component for open-domain NLP tasks. In open-domain tasks, multiple
entities can share a name, making disambiguation an inherent yet under-explored problem …