Text algorithms in economics

E Ash, S Hansen - Annual Review of Economics, 2023 - annualreviews.org
This article provides an overview of the methods used for algorithmic text analysis in
economics, with a focus on three key contributions. First, we introduce methods for …

A survey of fake news: Fundamental theories, detection methods, and opportunities

X Zhou, R Zafarani - ACM Computing Surveys (CSUR), 2020 - dl.acm.org
The explosive growth in fake news and its erosion of democracy, justice, and public trust have
increased the demand for fake news detection and intervention. This survey reviews and …

Improving text embeddings with large language models

L Wang, N Yang, X Huang, L Yang… - arXiv preprint arXiv …, 2023 - arxiv.org
In this paper, we introduce a novel and simple method for obtaining high-quality text
embeddings using only synthetic data and less than 1k training steps. Unlike existing …

Text embeddings by weakly-supervised contrastive pre-training

L Wang, N Yang, X Huang, B Jiao, L Yang… - arXiv preprint arXiv …, 2022 - arxiv.org
This paper presents E5, a family of state-of-the-art text embeddings that transfer well to a
wide range of tasks. The model is trained in a contrastive manner with weak supervision …
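
The snippet breaks off before the training details; as a rough illustration of the weakly supervised contrastive pre-training it mentions, below is a minimal sketch of an InfoNCE loss with in-batch negatives. The function name, the pre-computed query/passage embeddings, and the temperature of 0.05 are illustrative assumptions, not the released E5 code.

```python
import torch
import torch.nn.functional as F

def info_nce_loss(query_embs, passage_embs, temperature=0.05):
    """Contrastive loss with in-batch negatives: each query is pulled
    toward its paired passage (the diagonal) and pushed away from every
    other passage in the batch."""
    q = F.normalize(query_embs, dim=-1)    # (B, d) query embeddings
    p = F.normalize(passage_embs, dim=-1)  # (B, d) passage embeddings
    logits = q @ p.T / temperature         # (B, B) scaled cosine similarities
    labels = torch.arange(q.size(0), device=q.device)  # diagonal = positives
    return F.cross_entropy(logits, labels)
```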

Mind the gap: Understanding the modality gap in multi-modal contrastive representation learning

VW Liang, Y Zhang, Y Kwon… - Advances in Neural …, 2022 - proceedings.neurips.cc
We present the modality gap, an intriguing geometric phenomenon of the representation space
of multi-modal models. Specifically, we show that different data modalities (e.g., images and …
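
To make the "gap" operational: a minimal sketch, assuming the paper's centroid-based reading of the phenomenon, is to measure the distance between the means of the normalized image and text embeddings in the shared space. Function and argument names here are hypothetical.

```python
import numpy as np

def modality_gap(image_embs, text_embs):
    """Euclidean distance between the centroids of L2-normalized image
    and text embeddings; a clearly nonzero gap means the two modalities
    occupy separate regions of the shared representation space."""
    img = image_embs / np.linalg.norm(image_embs, axis=1, keepdims=True)
    txt = text_embs / np.linalg.norm(text_embs, axis=1, keepdims=True)
    return float(np.linalg.norm(img.mean(axis=0) - txt.mean(axis=0)))
```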

A massive 7T fMRI dataset to bridge cognitive neuroscience and artificial intelligence

EJ Allen, G St-Yves, Y Wu, JL Breedlove… - Nature …, 2022 - nature.com
Extensive sampling of neural activity during rich cognitive phenomena is critical for robust
understanding of brain function. Here we present the Natural Scenes Dataset (NSD), in …

SimCSE: Simple contrastive learning of sentence embeddings

T Gao, X Yao, D Chen - arXiv preprint arXiv:2104.08821, 2021 - arxiv.org
This paper presents SimCSE, a simple contrastive learning framework that greatly advances
state-of-the-art sentence embeddings. We first describe an unsupervised approach, which …
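
The unsupervised approach the snippet begins to describe uses dropout as the augmentation: the same batch is encoded twice, and the two dropout-perturbed views of each sentence form a positive pair. The loss has the same in-batch-negatives form as the contrastive sketch above; what is distinctive is the double forward pass. A minimal sketch, assuming `encoder` is a hypothetical callable in training mode (dropout active) that returns pooled sentence embeddings:

```python
import torch
import torch.nn.functional as F

def unsup_simcse_loss(encoder, batch_inputs, temperature=0.05):
    """Unsupervised SimCSE objective (sketch): encode the same batch
    twice; dropout makes the passes differ, so each sentence's second
    view is its positive and all other sentences are negatives."""
    z1 = F.normalize(encoder(**batch_inputs), dim=-1)  # first dropout mask
    z2 = F.normalize(encoder(**batch_inputs), dim=-1)  # second dropout mask
    logits = z1 @ z2.T / temperature                   # (B, B) similarities
    labels = torch.arange(z1.size(0), device=z1.device)
    return F.cross_entropy(logits, labels)             # diagonal = positives
```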

On the sentence embeddings from pre-trained language models

B Li, H Zhou, J He, M Wang, Y Yang, L Li - arXiv preprint arXiv:2011.05864, 2020 - arxiv.org
Pre-trained contextual representations like BERT have achieved great success in natural
language processing. However, the sentence embeddings from the pre-trained language …
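
The truncated sentence concerns a known weakness of raw BERT sentence embeddings. One simple diagnostic for the anisotropy this line of work discusses (a sketch, not the paper's own flow-based method) is the mean pairwise cosine similarity of a set of embeddings:

```python
import numpy as np

def mean_pairwise_cosine(embs):
    """Average cosine similarity over all distinct pairs; values far
    above zero indicate an anisotropic, cone-shaped embedding space."""
    x = embs / np.linalg.norm(embs, axis=1, keepdims=True)
    sims = x @ x.T                      # (n, n) cosine similarity matrix
    n = x.shape[0]
    return float((sims.sum() - n) / (n * (n - 1)))  # drop the n self-pairs
```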

Contrastive learning for cold-start recommendation

Y Wei, X Wang, Q Li, L Nie, Y Li, X Li… - Proceedings of the 29th …, 2021 - dl.acm.org
Recommending purely cold-start items is a long-standing and fundamental challenge in
recommender systems. Without any historical interaction on cold-start items, the …

Whitening sentence representations for better semantics and faster retrieval

J Su, J Cao, W Liu, Y Ou - arXiv preprint arXiv:2103.15316, 2021 - arxiv.org
Pre-trained models such as BERT have achieved great success in many natural language
processing tasks. However, how to obtain better sentence representations through these pre …
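
The transform in the title is concrete enough to sketch: shift the embeddings to zero mean, then rotate and rescale via the SVD of their covariance so the covariance becomes the identity, optionally truncating to the top components. A minimal sketch, assuming NumPy arrays of shape (n, d); names are illustrative:

```python
import numpy as np

def whiten(embs, k=None):
    """Whitening transform: zero-center the embeddings and map their
    covariance to the identity using the SVD of the covariance matrix.
    Optionally keep only the top-k components to reduce dimensionality."""
    mu = embs.mean(axis=0, keepdims=True)
    cov = np.cov((embs - mu).T)           # (d, d) covariance matrix
    u, s, _ = np.linalg.svd(cov)          # cov = u @ diag(s) @ u.T
    w = u @ np.diag(1.0 / np.sqrt(s))     # whitening matrix
    if k is not None:
        w = w[:, :k]                      # truncate for faster retrieval
    return (embs - mu) @ w, mu, w
```

Fitting `mu` and `w` once on a corpus and truncating `w` to a few hundred columns is what enables the faster retrieval the title refers to.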