Contemporary approaches in evolving language models
This article provides a comprehensive survey of contemporary language modeling
approaches within the realm of natural language processing (NLP) tasks. This paper …
SinKD: Sinkhorn Distance Minimization for Knowledge Distillation
Knowledge distillation (KD) has been widely adopted to compress large language models
(LLMs). Existing KD methods investigate various divergence measures including the …
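The snippet does not show SinKD's exact formulation, but the Sinkhorn distance it is named after can be sketched in a few lines. In this illustrative NumPy sketch (function name, `eps`, and `n_iters` are assumptions, not the paper's), `p` and `q` stand for teacher and student token distributions and `C` for a user-chosen ground-cost matrix:

```python
import numpy as np

def sinkhorn_distance(p, q, C, eps=0.1, n_iters=200):
    """Entropic-regularized optimal transport cost between discrete
    distributions p and q under ground-cost matrix C (Sinkhorn iterations)."""
    K = np.exp(-C / eps)                 # Gibbs kernel
    u = np.ones_like(p)
    for _ in range(n_iters):             # alternating scaling updates
        v = q / (K.T @ u)
        u = p / (K @ v)
    T = u[:, None] * K * v[None, :]      # resulting transport plan
    return float(np.sum(T * C))          # approximate Sinkhorn distance

# e.g. teacher vs. student next-token distributions over a tiny vocabulary
p = np.array([0.7, 0.2, 0.1])            # teacher
q = np.array([0.5, 0.3, 0.2])            # student
C = 1.0 - np.eye(3)                      # 0/1 ground cost between tokens
print(sinkhorn_distance(p, q, C))
```

Unlike KL-based divergences, this cost depends on the geometry encoded in `C`, which is the usual motivation for optimal-transport losses in distillation.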
Predict the Next Word: <Humans Exhibit Uncertainty in this Task and Language Models _>
Abstract Language models (LMs) are statistical models trained to assign probability to
human-generated text. As such, it is reasonable to question whether they approximate …
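For context on what "assigning probability" means here, the standard autoregressive factorization (a textbook identity, not quoted from this abstract) is

\[
p_\theta(w_1, \dots, w_T) \;=\; \prod_{t=1}^{T} p_\theta(w_t \mid w_1, \dots, w_{t-1}),
\]

so each next-word prediction is itself a full distribution over the vocabulary, which is where the comparison to human uncertainty in this task arises.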
EMO: Earth Mover Distance Optimization for Auto-Regressive Language Modeling
Neural language models are probabilistic models of human text. They are predominantly
trained using maximum likelihood estimation (MLE), which is equivalent to minimizing the …
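The snippet is truncated, but the standard identity behind it (a textbook fact, not quoted from the paper) is that MLE is equivalent to minimizing the forward KL divergence from the data distribution:

\[
\arg\max_\theta \; \mathbb{E}_{x \sim p_{\text{data}}}\big[\log p_\theta(x)\big]
\;=\;
\arg\min_\theta \; \mathrm{KL}\!\left(p_{\text{data}} \,\|\, p_\theta\right),
\]

since \(\mathrm{KL}(p_{\text{data}} \,\|\, p_\theta) = -H(p_{\text{data}}) - \mathbb{E}_{p_{\text{data}}}[\log p_\theta]\) and the data entropy is constant in \(\theta\). EMO, per its title, replaces this objective with an earth mover (optimal transport) distance.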
Transparency at the source: Evaluating and interpreting language models with access to the true distribution
We present a setup for training, evaluating and interpreting neural language models, that
uses artificial, language-like data. The data is generated using a massive probabilistic …
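The snippet truncates the generator's description; assuming a probabilistic grammar in the usual sense, a toy version of such a data generator looks like the following (the grammar, rules, and weights are illustrative inventions, not the paper's):

```python
import random

# Toy probabilistic grammar: each nonterminal maps to weighted expansions.
GRAMMAR = {
    "S":  [(["NP", "VP"], 1.0)],
    "NP": [(["the", "N"], 0.7), (["the", "A", "N"], 0.3)],
    "VP": [(["V", "NP"], 0.6), (["V"], 0.4)],
    "N":  [(["cat"], 0.5), (["dog"], 0.5)],
    "A":  [(["small"], 1.0)],
    "V":  [(["sees"], 0.5), (["sleeps"], 0.5)],
}

def sample(symbol="S"):
    """Recursively expand a symbol, choosing each rule by its weight."""
    if symbol not in GRAMMAR:            # terminal: emit the word itself
        return [symbol]
    rules, weights = zip(*GRAMMAR[symbol])
    rhs = random.choices(rules, weights=weights)[0]
    return [word for s in rhs for word in sample(s)]

print(" ".join(sample()))                # e.g. "the cat sees the dog"
```

Because the generating distribution is known exactly, a model's assigned probabilities can be compared against the true distribution rather than only against held-out text.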
Beyond MLE: convex learning for text generation
Maximum likelihood estimation (MLE) is a statistical method used to estimate the parameters
of a probability distribution that best explain the observed data. In the context of text …
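In the context of text generation, the MLE objective is the familiar token-level negative log-likelihood; a minimal PyTorch sketch (function name and shapes are illustrative assumptions):

```python
import torch
import torch.nn.functional as F

def mle_loss(logits: torch.Tensor, targets: torch.Tensor) -> torch.Tensor:
    """Average negative log-likelihood of the observed next tokens,
    i.e. the token-level MLE / cross-entropy objective.
    logits: (batch, seq_len, vocab_size); targets: (batch, seq_len)."""
    vocab_size = logits.size(-1)
    return F.cross_entropy(logits.reshape(-1, vocab_size),
                           targets.reshape(-1))
```

The paper, per its title, studies convex learning objectives that go beyond this standard loss.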
An improved two-stage zero-shot relation triplet extraction model with hybrid cross-entropy loss and discriminative reranking
D Li, L Zhang, J Zhou, J Huang, N Xiong… - Expert Systems with …, 2025 - Elsevier
Zero-shot relation triplet extraction (ZeroRTE) aims to extract relation triplets from
unstructured text under zero-shot conditions, where the relation sets in the training and …
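To make the zero-shot condition concrete, here is a hypothetical illustration (the sentence, relation labels, and triplet are invented for exposition, not taken from the paper):

```python
# Hypothetical ZeroRTE setup: the relation label sets for training and
# test are disjoint, so test-time relations were never seen in training.
TRAIN_RELATIONS = {"founded_by", "capital_of"}
TEST_RELATIONS = {"born_in", "composer_of"}
assert TRAIN_RELATIONS.isdisjoint(TEST_RELATIONS)

# A triplet the extractor should produce for an unseen relation:
sentence = "Marie Curie was born in Warsaw."
triplet = ("Marie Curie", "born_in", "Warsaw")  # (head, relation, tail)
```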
FreStega: A Plug-and-Play Method for Boosting Imperceptibility and Capacity in Generative Linguistic Steganography for Real-World Scenarios
K Pang - arXiv preprint arXiv:2412.19652, 2024 - arxiv.org
Linguistic steganography embeds secret information in seemingly innocent texts,
safeguarding privacy in surveillance environments. Generative linguistic steganography …
Finding structure in language models
J Jumelet - arXiv preprint arXiv:2411.16433, 2024 - arxiv.org
When we speak, write or listen, we continuously make predictions based on our knowledge
of a language's grammar. Remarkably, children acquire this grammatical knowledge within …