A comparison of SVM against pre-trained language models (PLMs) for text classification tasks

Y Wahba, N Madhavji, J Steinbacher - International Conference on …, 2022 - Springer
The emergence of pre-trained language models (PLMs) has shown great success in many
Natural Language Processing (NLP) tasks including text classification. Due to the minimal to …

Less is more: Pruning BERTweet architecture in Twitter sentiment analysis

R Moura, J Carvalho, A Plastino, A Paes - Information Processing & …, 2024 - Elsevier
Transformer-based models have been scaled up to absorb more information and improve their performance. However, several studies have called attention to their …

Tuning Language Models by Mixture-of-Depths Ensemble

H Luo, L Specia - arXiv preprint arXiv:2410.13077, 2024 - arxiv.org
Transformer-based Large Language Models (LLMs) traditionally rely on final-layer loss for
training and final-layer representations for predictions, potentially overlooking the predictive …

Mitigating Hallucination Issues in Small-Parameter LLMs through Inter-Layer Contrastive Decoding

F Li, P Zhang - … Joint Conference on Neural Networks (IJCNN …, 2024 - ieeexplore.ieee.org
In this paper, we introduce a new decoding method to mitigate the issue of hallucinations in
Large Language Models (LLMs). Specifically, our method dynamically selects appropriate …