A primer on neural network models for natural language processing

Y Goldberg - Journal of Artificial Intelligence Research, 2016 - jair.org
Over the past few years, neural networks have re-emerged as powerful machine-learning
models, yielding state-of-the-art results in fields such as image recognition and speech …

Kernel methods in machine learning

T Hofmann, B Schölkopf, AJ Smola - 2008 - projecteuclid.org
We review machine learning methods employing positive definite kernels. These methods
formulate learning and estimation problems in a reproducing kernel Hilbert space (RKHS) of …

LEVER: Learning to verify language-to-code generation with execution

A Ni, S Iyer, D Radev, V Stoyanov… - International …, 2023 - proceedings.mlr.press
The advent of large language models trained on code (code LLMs) has led to significant
progress in language-to-code generation. State-of-the-art approaches in this area combine …

Lift yourself up: Retrieval-augmented text generation with self-memory

X Cheng, D Luo, X Chen, L Liu… - Advances in Neural …, 2023 - proceedings.neurips.cc
With direct access to human-written reference as memory, retrieval-augmented generation
has achieved much progress in a wide range of text generation tasks. Since better memory …

[BOOK][B] Neural network methods in natural language processing

Y Goldberg - 2017 - books.google.com
Neural networks are a family of powerful machine learning models, and this book focuses on
their application to natural language data. The first half of the book (Parts I and II) covers the …

SummaReranker: A multi-task mixture-of-experts re-ranking framework for abstractive summarization

M Ravaut, S Joty, NF Chen - arXiv preprint arXiv:2203.06569, 2022 - arxiv.org
Sequence-to-sequence neural networks have recently achieved great success in abstractive
summarization, especially through fine-tuning large pre-trained language models on the …

CogLTX: Applying BERT to long texts

M Ding, C Zhou, H Yang, J Tang - Advances in Neural …, 2020 - proceedings.neurips.cc
BERT is incapable of processing long texts due to its quadratically increasing memory
and time consumption. Straightforward approaches to this problem, such as slicing …

Going out on a limb: Joint extraction of entity mentions and relations without dependency trees

A Katiyar, C Cardie - Proceedings of the 55th Annual Meeting of …, 2017 - aclanthology.org
We present a novel attention-based recurrent neural network for joint extraction of entity
mentions and relations. We show that attention along with long short-term memory (LSTM) …

SpanNER: Named entity re-/recognition as span prediction

J Fu, X Huang, P Liu - arXiv preprint arXiv:2106.00641, 2021 - arxiv.org
Recent years have seen the paradigm shift of Named Entity Recognition (NER) systems
from sequence labeling to span prediction. Despite its preliminary effectiveness, the span …

Rewarding progress: Scaling automated process verifiers for LLM reasoning

A Setlur, C Nagpal, A Fisch, X Geng… - arXiv preprint arXiv …, 2024 - arxiv.org
A promising approach for improving reasoning in large language models is to use process
reward models (PRMs). PRMs provide feedback at each step of a multi-step reasoning trace …