RetroMAE: Pre-training retrieval-oriented language models via masked auto-encoder

S Xiao, Z Liu, Y Shao, Z Cao - arXiv preprint arXiv:2205.12035, 2022 - arxiv.org
Despite pre-training's progress on many important NLP tasks, effective pre-training
strategies for dense retrieval remain underexplored. In this paper, we propose RetroMAE, a new …
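
The full paper's key mechanism is asymmetric masking: the encoder sees a lightly masked sentence and compresses it into a single [CLS] embedding, while a shallow decoder must reconstruct an aggressively masked copy using only that embedding as extra context. A minimal sketch, assuming PyTorch; the helper `asymmetric_mask`, the module signatures, and the mask ratios are illustrative stand-ins, not the paper's exact configuration.

```python
import torch

def asymmetric_mask(token_ids, ratio, mask_id=103):
    # Hypothetical helper: replace a random fraction of tokens with [MASK].
    mask = torch.rand(token_ids.shape) < ratio
    return torch.where(mask, torch.full_like(token_ids, mask_id), token_ids), mask

def retromae_step(encoder, decoder, mlm_head, token_ids):
    # Encoder side: light masking; compress the sentence into one [CLS] vector.
    enc_ids, _ = asymmetric_mask(token_ids, ratio=0.3)
    cls_emb = encoder(enc_ids)[:, 0]                    # (batch, hidden)

    # Decoder side: aggressive masking; the shallow decoder can only succeed
    # if the sentence embedding carries enough of the input's content.
    dec_ids, masked = asymmetric_mask(token_ids, ratio=0.5)
    dec_states = decoder(dec_ids, sentence_embedding=cls_emb)
    logits = mlm_head(dec_states)                       # (batch, seq, vocab)

    # Reconstruction loss on masked positions only.
    return torch.nn.functional.cross_entropy(logits[masked], token_ids[masked])
```

The asymmetry is the design point: an easy encoding task plus a hard decoding task pressures the single embedding, which is exactly what dense retrieval consumes.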

Exploring the benefits of training expert language models over instruction tuning

J Jang, S Kim, S Ye, D Kim… - International …, 2023 - proceedings.mlr.press
Recently, Language Models (LMs) instruction-tuned on multiple tasks, a setup also known
as multitask-prompted fine-tuning (MT), have shown the ability to generalize to unseen …

Angle-optimized text embeddings

X Li, J Li - arXiv preprint arXiv:2309.12871, 2023 - arxiv.org
High-quality text embedding is pivotal in improving semantic textual similarity (STS) tasks,
which are crucial components in Large Language Model (LLM) applications. However, a …
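
The paper's "angle optimization" works in complex space: each embedding is split into a real and an imaginary half, and the objective operates on the angle of the complex division between paired embeddings, which remains informative where the cosine's gradient saturates. A rough sketch of the angle computation, assuming PyTorch; the aggregation across dimensions and the surrounding ranking loss are simplified relative to the paper.

```python
import torch

def angle_difference(u, v):
    """Treat each embedding's two halves as the real/imaginary parts of a
    complex vector and return the phase of the complex inner product u·conj(v).
    Assumes the embedding dimension is even."""
    a, b = u.chunk(2, dim=-1)          # u = a + bi
    c, d = v.chunk(2, dim=-1)          # v = c + di
    # (a+bi)(c-di) = (ac+bd) + (bc-ad)i, summed over dimensions; the positive
    # magnitude denominator of a full division would not change the angle.
    real = (a * c + b * d).sum(-1)
    imag = (b * c - a * d).sum(-1)
    return torch.atan2(imag, real)     # per-pair angle in (-pi, pi]
```

In a CoSENT-style ranking objective, pairs with higher similarity labels would then be pushed toward smaller absolute angle differences.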

SimLM: Pre-training with representation bottleneck for dense passage retrieval

L Wang, N Yang, X Huang, B Jiao, L Yang… - arXiv preprint arXiv …, 2022 - arxiv.org
In this paper, we propose SimLM (Similarity matching with Language Model pre-training), a
simple yet effective pre-training method for dense passage retrieval. It employs a simple …

Improving contrastive learning of sentence embeddings from AI feedback

Q Cheng, X Yang, T Sun, L Li, X Qiu - arXiv preprint arXiv:2305.01918, 2023 - arxiv.org
Contrastive learning has become a popular approach in natural language processing,
particularly for learning sentence embeddings. However, the discrete nature of natural …
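
The "discrete nature" complaint refers to standard contrastive learning's hard positive/negative split; AI feedback instead supplies graded similarity scores. A minimal sketch of one way such scores can supervise an embedding model, assuming an LLM judge has already rated each pair; regressing cosine similarity onto the score with MSE is an illustrative choice, not necessarily the paper's exact loss.

```python
import torch
import torch.nn.functional as F

def soft_similarity_loss(emb_a, emb_b, ai_scores):
    """emb_a, emb_b: (batch, dim) embeddings for each sentence pair.
    ai_scores: (batch,) graded similarity ratings in [0, 1] from an LLM.
    Fit cosine similarity to the graded score instead of forcing a hard
    positive/negative decision."""
    cos = F.cosine_similarity(emb_a, emb_b, dim=-1)   # in [-1, 1]
    return F.mse_loss((cos + 1) / 2, ai_scores)       # rescale to [0, 1]
```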

RaSa: Relation and sensitivity aware representation learning for text-based person search

Y Bai, M Cao, D Gao, Z Cao, C Chen, Z Fan… - arXiv preprint arXiv …, 2023 - arxiv.org
Text-based person search aims to retrieve the specified person images given a textual
description. The key to tackling such a challenging task is to learn powerful multi-modal …

Scaling sentence embeddings with large language models

T Jiang, S Huang, Z Luan, D Wang… - arXiv preprint arXiv …, 2023 - arxiv.org
Large language models (LLMs) have recently garnered significant interest. With in-context
learning, LLMs achieve impressive results in various natural language tasks. However, the …
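
This line of work turns a decoder-only LLM into a sentence encoder by prompting it to compress a sentence "in one word" and reading the last token's hidden state as the embedding. A minimal sketch, assuming the Hugging Face transformers API; the model choice and exact prompt template are illustrative, not the paper's reported setup.

```python
import torch
from transformers import AutoModel, AutoTokenizer

# Model name is an assumption for illustration; any decoder-only LM works.
tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModel.from_pretrained("gpt2")

def llm_embed(text: str) -> torch.Tensor:
    # Ask the LM to summarize the sentence "in one word"; the hidden state
    # at the final position then serves as the sentence embedding.
    prompt = f'This sentence : "{text}" means in one word:"'
    inputs = tok(prompt, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**inputs).last_hidden_state    # (1, seq, dim)
    return hidden[0, -1]                              # last-token state
```

The appeal is that this needs no fine-tuning at all: scaling comes from swapping in a larger base model.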

InfoCSE: Information-aggregated contrastive learning of sentence embeddings

X Wu, C Gao, Z Lin, J Han, Z Wang, S Hu - arXiv preprint arXiv …, 2022 - arxiv.org
Contrastive learning has been extensively studied for sentence embedding learning; it
assumes that the embeddings of different views of the same sentence should be close. The …
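
The assumption stated here is typically implemented with an InfoNCE objective: two dropout-noised encodings of the same sentence are positives, and the other sentences in the batch are negatives. A minimal sketch of that base objective, assuming PyTorch; InfoCSE's actual contribution, an auxiliary masked-language-modeling network that aggregates sentence information into the embedding, sits on top of this and is omitted.

```python
import torch
import torch.nn.functional as F

def info_nce(view1, view2, temperature=0.05):
    """view1, view2: (batch, dim) embeddings of the same sentences under two
    different dropout masks; row i of view1 should match row i of view2."""
    z1 = F.normalize(view1, dim=-1)
    z2 = F.normalize(view2, dim=-1)
    sim = z1 @ z2.T / temperature                       # (batch, batch) cosines
    labels = torch.arange(sim.size(0), device=sim.device)  # diagonal positives
    return F.cross_entropy(sim, labels)
```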

CLSEP: Contrastive learning of sentence embedding with prompt

Q Wang, W Zhang, T Lei, Y Cao, D Peng… - Knowledge-Based …, 2023 - Elsevier
Sentence embedding, which aims to learn an effective representation of a sentence, is
beneficial for downstream tasks. By utilizing contrastive learning, most recent sentence …

Equivariant contrastive learning for sequential recommendation

P Zhou, J Gao, Y Xie, Q Ye, Y Hua, J Kim… - Proceedings of the 17th …, 2023 - dl.acm.org
Contrastive learning (CL) benefits the training of sequential recommendation models with
informative self-supervision signals. Existing solutions apply general sequential data …
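
Standard contrastive learning trains representations to be invariant to augmentations; equivariant contrastive learning instead asks them to respond predictably to augmentations, for example by predicting which augmentation produced a view. A minimal sketch of that auxiliary objective, assuming PyTorch; the prediction head and augmentation set are illustrative, not the paper's exact design.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class EquivariancePredictor(nn.Module):
    """Auxiliary head: from (original, augmented) sequence representations,
    predict which augmentation was applied. If this task is learnable, the
    encoder has preserved, rather than erased, the augmentation's effect."""
    def __init__(self, dim, num_augmentations):
        super().__init__()
        self.head = nn.Linear(2 * dim, num_augmentations)

    def forward(self, rep_orig, rep_aug, aug_labels):
        logits = self.head(torch.cat([rep_orig, rep_aug], dim=-1))
        return F.cross_entropy(logits, aug_labels)
```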