Contemporary approaches in evolving language models
This article provides a comprehensive survey of contemporary language modeling
approaches within the realm of natural language processing (NLP) tasks. This paper …
SinKD: Sinkhorn Distance Minimization for Knowledge Distillation
Knowledge distillation (KD) has been widely adopted to compress large language models
(LLMs). Existing KD methods investigate various divergence measures including the …
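The snippet does not show SinKD's exact formulation, but the Sinkhorn distance it is named after can be sketched in a few lines. In this illustrative NumPy sketch (function name, `eps`, and `n_iters` are assumptions, not the paper's), `p` and `q` stand for teacher and student token distributions and `C` for a user-chosen ground-cost matrix:

```python
import numpy as np

def sinkhorn_distance(p, q, C, eps=0.1, n_iters=200):
    """Entropic-regularized optimal transport cost between discrete
    distributions p and q under ground-cost matrix C (Sinkhorn iterations)."""
    K = np.exp(-C / eps)                 # Gibbs kernel
    u = np.ones_like(p)
    for _ in range(n_iters):             # alternating scaling updates
        v = q / (K.T @ u)
        u = p / (K @ v)
    T = u[:, None] * K * v[None, :]      # resulting transport plan
    return float(np.sum(T * C))          # approximate Sinkhorn distance

# e.g. teacher vs. student next-token distributions over a tiny vocabulary
p = np.array([0.7, 0.2, 0.1])            # teacher
q = np.array([0.5, 0.3, 0.2])            # student
C = 1.0 - np.eye(3)                      # 0/1 ground cost between tokens
print(sinkhorn_distance(p, q, C))
```

Unlike KL-based divergences, this cost depends on the geometry encoded in `C`, which is the usual motivation for optimal-transport losses in distillation.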
Predict the Next Word: <Humans Exhibit Uncertainty in this Task and Language Models _>
Abstract Language models (LMs) are statistical models trained to assign probability to
human-generated text. As such, it is reasonable to question whether they approximate …
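For context on what "assigning probability" means here, the standard autoregressive factorization (a textbook identity, not quoted from this abstract) is

\[
p_\theta(w_1, \dots, w_T) \;=\; \prod_{t=1}^{T} p_\theta(w_t \mid w_1, \dots, w_{t-1}),
\]

so each next-word prediction is itself a full distribution over the vocabulary, which is where the comparison to human uncertainty in this task arises.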
EMO: Earth Mover Distance Optimization for Auto-Regressive Language Modeling
Neural language models are probabilistic models of human text. They are predominantly
trained using maximum likelihood estimation (MLE), which is equivalent to minimizing the …
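The snippet is truncated, but the standard identity behind it (a textbook fact, not quoted from the paper) is that MLE is equivalent to minimizing the forward KL divergence from the data distribution:

\[
\arg\max_\theta \; \mathbb{E}_{x \sim p_{\text{data}}}\big[\log p_\theta(x)\big]
\;=\;
\arg\min_\theta \; \mathrm{KL}\!\left(p_{\text{data}} \,\|\, p_\theta\right),
\]

since \(\mathrm{KL}(p_{\text{data}} \,\|\, p_\theta) = -H(p_{\text{data}}) - \mathbb{E}_{p_{\text{data}}}[\log p_\theta]\) and the data entropy is constant in \(\theta\). EMO, per its title, replaces this objective with an earth mover (optimal transport) distance.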
Transparency at the source: Evaluating and interpreting language models with access to the true distribution
We present a setup for training, evaluating and interpreting neural language models, that
uses artificial, language-like data. The data is generated using a massive probabilistic …
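The snippet truncates the generator's description; assuming a probabilistic grammar in the usual sense, a toy version of such a data generator looks like the following (the grammar, rules, and weights are illustrative inventions, not the paper's):

```python
import random

# Toy probabilistic grammar: each nonterminal maps to weighted expansions.
GRAMMAR = {
    "S":  [(["NP", "VP"], 1.0)],
    "NP": [(["the", "N"], 0.7), (["the", "A", "N"], 0.3)],
    "VP": [(["V", "NP"], 0.6), (["V"], 0.4)],
    "N":  [(["cat"], 0.5), (["dog"], 0.5)],
    "A":  [(["small"], 1.0)],
    "V":  [(["sees"], 0.5), (["sleeps"], 0.5)],
}

def sample(symbol="S"):
    """Recursively expand a symbol, choosing each rule by its weight."""
    if symbol not in GRAMMAR:            # terminal: emit the word itself
        return [symbol]
    rules, weights = zip(*GRAMMAR[symbol])
    rhs = random.choices(rules, weights=weights)[0]
    return [word for s in rhs for word in sample(s)]

print(" ".join(sample()))                # e.g. "the cat sees the dog"
```

Because the generating distribution is known exactly, a model's assigned probabilities can be compared against the true distribution rather than only against held-out text.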
Beyond MLE: convex learning for text generation
Maximum likelihood estimation (MLE) is a statistical method used to estimate the parameters
of a probability distribution that best explain the observed data. In the context of text …
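In the context of text generation, the MLE objective is the familiar token-level negative log-likelihood; a minimal PyTorch sketch (function name and shapes are illustrative assumptions):

```python
import torch
import torch.nn.functional as F

def mle_loss(logits: torch.Tensor, targets: torch.Tensor) -> torch.Tensor:
    """Average negative log-likelihood of the observed next tokens,
    i.e. the token-level MLE / cross-entropy objective.
    logits: (batch, seq_len, vocab_size); targets: (batch, seq_len)."""
    vocab_size = logits.size(-1)
    return F.cross_entropy(logits.reshape(-1, vocab_size),
                           targets.reshape(-1))
```

The paper, per its title, studies convex learning objectives that go beyond this standard loss.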
An improved two-stage zero-shot relation triplet extraction model with hybrid cross-entropy loss and discriminative reranking
D Li, L Zhang, J Zhou, J Huang, N Xiong… - Expert Systems with …, 2025 - Elsevier
Zero-shot relation triplet extraction (ZeroRTE) aims to extract relation triplets from
unstructured text under zero-shot conditions, where the relation sets in the training and …
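To make the zero-shot condition concrete, here is a hypothetical illustration (the sentence, relation labels, and triplet are invented for exposition, not taken from the paper):

```python
# Hypothetical ZeroRTE setup: the relation label sets for training and
# test are disjoint, so test-time relations were never seen in training.
TRAIN_RELATIONS = {"founded_by", "capital_of"}
TEST_RELATIONS = {"born_in", "composer_of"}
assert TRAIN_RELATIONS.isdisjoint(TEST_RELATIONS)

# A triplet the extractor should produce for an unseen relation:
sentence = "Marie Curie was born in Warsaw."
triplet = ("Marie Curie", "born_in", "Warsaw")  # (head, relation, tail)
```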
FreStega: A Plug-and-Play Method for Boosting Imperceptibility and Capacity in Generative Linguistic Steganography for Real-World Scenarios
K Pang - arXiv preprint arXiv:2412.19652, 2024 - arxiv.org
Linguistic steganography embeds secret information in seemingly innocent texts,
safeguarding privacy in surveillance environments. Generative linguistic steganography …
Finding structure in language models
J Jumelet - arXiv preprint arXiv:2411.16433, 2024 - arxiv.org
When we speak, write or listen, we continuously make predictions based on our knowledge
of a language's grammar. Remarkably, children acquire this grammatical knowledge within …