From word embeddings to pre-trained language models: A state-of-the-art walkthrough

M Mars - Applied Sciences, 2022 - mdpi.com
With the recent advances in deep learning, different approaches to improving pre-trained
language models (PLMs) have been proposed. PLMs have advanced state-of-the-art …

Deep transfer learning & beyond: Transformer language models in information systems research

R Gruetzemacher, D Paradice - ACM Computing Surveys (CSUR), 2022 - dl.acm.org
AI is widely thought to be poised to transform business, yet current perceptions of the scope
of this transformation may be myopic. Recent progress in natural language processing …

Paraphrasing evades detectors of AI-generated text, but retrieval is an effective defense

K Krishna, Y Song, M Karpinska… - Advances in Neural …, 2023 - proceedings.neurips.cc
The rise in malicious usage of large language models, such as fake content creation and
academic plagiarism, has motivated the development of approaches that identify AI …
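
The defense described here has the LM provider keep a database of its own generations and flag candidate text that is semantically close to any stored output; unlike classifier-based detectors, this survives paraphrasing. A minimal sketch, assuming a sentence-embedding retriever; the MiniLM checkpoint and the 0.8 threshold are illustrative choices, not the paper's exact setup:

```python
from sentence_transformers import SentenceTransformer

encoder = SentenceTransformer("all-MiniLM-L6-v2")

# Corpus of texts the provider has previously generated (toy examples).
generations = [
    "The mitochondria is the powerhouse of the cell.",
    "Large language models can be fine-tuned for many downstream tasks.",
]
corpus = encoder.encode(generations, normalize_embeddings=True)  # (N, d), unit norm

def is_likely_ai(candidate: str, threshold: float = 0.8) -> bool:
    """Flag `candidate` if it is semantically close to any stored generation."""
    query = encoder.encode([candidate], normalize_embeddings=True)  # (1, d)
    similarity = (corpus @ query.T).max()  # cosine similarity via dot product
    return float(similarity) >= threshold
```

Because the comparison is done in embedding space, a paraphrase of a stored generation still scores high even when surface wording changes.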

A survey on RAG meeting LLMs: Towards retrieval-augmented large language models

W Fan, Y Ding, L Ning, S Wang, H Li, D Yin… - Proceedings of the 30th …, 2024 - dl.acm.org
As one of the most advanced techniques in AI, Retrieval-Augmented Generation (RAG) can
offer reliable and up-to-date external knowledge, providing huge convenience for numerous …
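
The retrieve-then-generate loop surveyed here can be sketched in a few lines: fetch the passages most relevant to a query, prepend them to the prompt, and let the generator condition on them. In this sketch the retriever is plain TF-IDF and `llm_generate` is a hypothetical stand-in for any model call, not an API from the paper:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

documents = [
    "RAG conditions a generator on passages fetched from an external index.",
    "Retrieval keeps model outputs grounded in up-to-date external knowledge.",
]
vectorizer = TfidfVectorizer()
doc_matrix = vectorizer.fit_transform(documents)

def retrieve(query: str, k: int = 2) -> list[str]:
    """Return the k documents most similar to the query."""
    scores = cosine_similarity(vectorizer.transform([query]), doc_matrix)[0]
    return [documents[i] for i in scores.argsort()[::-1][:k]]

def llm_generate(prompt: str) -> str:
    """Hypothetical stand-in for any LLM client call."""
    raise NotImplementedError("plug in a model API or local checkpoint here")

def rag_answer(question: str) -> str:
    # Retrieved passages are prepended so the generator can ground its answer.
    context = "\n".join(retrieve(question))
    return llm_generate(f"Context:\n{context}\n\nQuestion: {question}\nAnswer:")
```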

Improving the domain adaptation of retrieval augmented generation (RAG) models for open domain question answering

S Siriwardhana, R Weerasekera, E Wen… - Transactions of the …, 2023 - direct.mit.edu
Retrieval-Augmented Generation (RAG) is a recent advancement in Open-Domain
Question Answering (ODQA). RAG has only been trained and explored with a Wikipedia …

VideoCLIP: Contrastive pre-training for zero-shot video-text understanding

H Xu, G Ghosh, PY Huang, D Okhonko… - arXiv preprint arXiv …, 2021 - arxiv.org
We present VideoCLIP, a contrastive approach to pre-train a unified model for zero-shot
video and text understanding, without using any labels on downstream tasks. VideoCLIP …
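
The contrastive objective behind this kind of pre-training is a symmetric InfoNCE loss: matched video-text pairs are pulled together, all other pairs in the batch serve as negatives. A sketch of that objective; the shapes and temperature are illustrative, not VideoCLIP's exact recipe:

```python
import torch
import torch.nn.functional as F

def contrastive_loss(video_emb: torch.Tensor, text_emb: torch.Tensor,
                     temperature: float = 0.07) -> torch.Tensor:
    """Symmetric InfoNCE over a batch of matched (video, text) embedding pairs."""
    video_emb = F.normalize(video_emb, dim=-1)
    text_emb = F.normalize(text_emb, dim=-1)
    logits = video_emb @ text_emb.T / temperature  # (batch, batch) similarities
    targets = torch.arange(len(logits))            # matched pairs sit on the diagonal
    # Pull each clip toward its caption and vice versa; off-diagonal pairs are negatives.
    return (F.cross_entropy(logits, targets) + F.cross_entropy(logits.T, targets)) / 2

# Example: a batch of 8 clip/caption embeddings of dimension 256.
loss = contrastive_loss(torch.randn(8, 256), torch.randn(8, 256))
```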

Retrieval-augmented multimodal language modeling

M Yasunaga, A Aghajanyan, W Shi, R James… - arXiv preprint arXiv …, 2022 - arxiv.org
Recent multimodal models such as DALL-E and CM3 have achieved remarkable progress
in text-to-image and image-to-text generation. However, these models store all learned …

mT5: A massively multilingual pre-trained text-to-text transformer

L Xue, N Constant, A Roberts, M Kale… - arXiv preprint arXiv …, 2020 - arxiv.org
The recent" Text-to-Text Transfer Transformer"(T5) leveraged a unified text-to-text format and
scale to attain state-of-the-art results on a wide variety of English-language NLP tasks. In this …
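
The text-to-text framing means every task is plain string-in, string-out. A loading sketch using the Hugging Face transformers library; note that the raw mT5 checkpoint is pretrained only on span corruption, so its generations are only meaningful after task-specific fine-tuning:

```python
from transformers import AutoTokenizer, MT5ForConditionalGeneration

tokenizer = AutoTokenizer.from_pretrained("google/mt5-small")
model = MT5ForConditionalGeneration.from_pretrained("google/mt5-small")

# Any task is expressed as text in, text out; the prefix is a convention, not an API.
inputs = tokenizer("summarize: mT5 was pretrained on mC4, covering 101 languages.",
                   return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```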

Memorizing transformers

Y Wu, MN Rabe, DL Hutchins, C Szegedy - arXiv preprint arXiv …, 2022 - arxiv.org
Language models typically need to be trained or finetuned in order to acquire new
knowledge, which involves updating their weights. We instead envision language models …
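
The core mechanism is an external cache of (key, value) pairs from earlier context that the model queries with approximate kNN attention at inference time, so new knowledge is looked up rather than folded into the weights. A minimal sketch of that lookup; dimensions and the exact-search memory are illustrative simplifications of the paper's approximate index:

```python
import torch
import torch.nn.functional as F

class KNNMemory:
    """External cache of past (key, value) pairs, queried with kNN attention."""

    def __init__(self, dim: int):
        self.keys = torch.empty(0, dim)
        self.values = torch.empty(0, dim)

    def add(self, keys: torch.Tensor, values: torch.Tensor) -> None:
        # Append (key, value) pairs from an already-processed context chunk.
        self.keys = torch.cat([self.keys, keys])
        self.values = torch.cat([self.values, values])

    def attend(self, query: torch.Tensor, k: int = 4) -> torch.Tensor:
        # Retrieve the k most similar cached keys and attend over just those.
        scores = query @ self.keys.T                          # (1, N) similarities
        top_scores, idx = scores.topk(min(k, self.keys.shape[0]), dim=-1)
        weights = F.softmax(top_scores, dim=-1)               # (1, k)
        return (weights.unsqueeze(-1) * self.values[idx]).sum(dim=1)  # (1, dim)

memory = KNNMemory(dim=64)
memory.add(torch.randn(128, 64), torch.randn(128, 64))
context_vector = memory.attend(torch.randn(1, 64))
```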

Intrinsic dimensionality explains the effectiveness of language model fine-tuning

A Aghajanyan, L Zettlemoyer, S Gupta - arXiv preprint arXiv:2012.13255, 2020 - arxiv.org
Although pretrained language models can be fine-tuned to produce state-of-the-art results
for a very wide range of language understanding tasks, the dynamics of this process are not …
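
The measurement technique here restricts fine-tuning to a low-dimensional affine subspace: only a small vector is trained, and a fixed random projection maps it into the full parameter space; the intrinsic dimension is the smallest subspace size that still reaches good task performance. A sketch of that reparameterization with illustrative sizes and a placeholder loss:

```python
import torch

full_dim, intrinsic_dim = 10_000, 200        # illustrative sizes
theta_0 = torch.randn(full_dim)              # frozen (flattened) pretrained weights
projection = torch.randn(full_dim, intrinsic_dim) / intrinsic_dim ** 0.5  # fixed, random
theta_d = torch.zeros(intrinsic_dim, requires_grad=True)  # the only trained parameters

def current_weights() -> torch.Tensor:
    # Fine-tuning is restricted to a d-dimensional affine subspace of weight space.
    return theta_0 + projection @ theta_d

optimizer = torch.optim.Adam([theta_d], lr=1e-3)
loss = current_weights().pow(2).mean()       # placeholder loss for illustration
loss.backward()
optimizer.step()
```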