Preserving integrity in online social networks

A Halevy, C Canton-Ferrer, H Ma, U Ozertem… - Communications of the …, 2022 - dl.acm.org
The goal of online social networks is to …

Do vision transformers see like convolutional neural networks?

M Raghu, T Unterthiner, S Kornblith… - Advances in neural …, 2021 - proceedings.neurips.cc
Convolutional neural networks (CNNs) have so far been the de facto model for visual data.
Recent work has shown that (Vision) Transformer models (ViT) can achieve comparable or …
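This line of work compares layer-wise representations, typically with linear CKA (centered kernel alignment). Below is a minimal sketch of that measure under the assumption of random placeholder activations, not real ViT or CNN features.

```python
# Minimal sketch of linear CKA, the representation-similarity measure
# commonly used for layer-wise ViT/CNN comparisons. The activations here
# are random placeholders for illustration only.
import numpy as np

def linear_cka(x: np.ndarray, y: np.ndarray) -> float:
    """Linear CKA between two feature matrices of shape (n_examples, dim)."""
    x = x - x.mean(axis=0, keepdims=True)  # center each feature dimension
    y = y - y.mean(axis=0, keepdims=True)
    # ||Y^T X||_F^2 / (||X^T X||_F * ||Y^T Y||_F)
    numerator = np.linalg.norm(y.T @ x, ord="fro") ** 2
    denominator = (np.linalg.norm(x.T @ x, ord="fro") *
                   np.linalg.norm(y.T @ y, ord="fro"))
    return float(numerator / denominator)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    acts_a = rng.normal(size=(512, 768))        # stand-in for one layer's activations
    q, _ = np.linalg.qr(rng.normal(size=(768, 768)))
    acts_b = acts_a @ q                         # orthogonal rotation of the same features
    print(linear_cka(acts_a, acts_b))           # ~1.0: CKA is invariant to rotations
```

Linear CKA is invariant to orthogonal transformations and isotropic scaling of either representation, which is why the rotated copy scores close to 1.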

On the cross-lingual transferability of monolingual representations

M Artetxe, S Ruder, D Yogatama - arXiv preprint arXiv:1910.11856, 2019 - arxiv.org
State-of-the-art unsupervised multilingual models (e.g., multilingual BERT) have been shown
to generalize in a zero-shot cross-lingual setting. This generalization ability has been …
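As a rough illustration of the zero-shot cross-lingual setting these abstracts refer to, the sketch below fine-tunes a single multilingual encoder on English labels only and then applies it unchanged to another language. The tiny in-memory examples and the two-class task are illustrative assumptions, not the papers' benchmarks.

```python
# Minimal sketch of zero-shot cross-lingual transfer: train on English only,
# evaluate directly on another language. Toy data, not a real benchmark.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_name = "bert-base-multilingual-cased"  # mBERT; XLM-R would be "xlm-roberta-base"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=2)

# English training data only (hypothetical toy sentiment examples).
train_texts = ["the film was wonderful", "a dull and tedious movie"]
train_labels = torch.tensor([1, 0])

# Target-language evaluation data: no Spanish example is ever used for training.
eval_texts = ["la película fue maravillosa", "una película aburrida y tediosa"]

optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)
model.train()
for _ in range(3):  # a few gradient steps, just to illustrate the loop
    batch = tokenizer(train_texts, padding=True, return_tensors="pt")
    loss = model(**batch, labels=train_labels).loss
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()

# Zero-shot evaluation: the same weights, applied to unseen-language input.
model.eval()
with torch.no_grad():
    batch = tokenizer(eval_texts, padding=True, return_tensors="pt")
    predictions = model(**batch).logits.argmax(dim=-1)
print(predictions)
```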

[PDF] Unsupervised cross-lingual representation learning at scale

A Conneau - arXiv preprint arXiv:1911.02116, 2019 - fq.pkwyx.com
This paper shows that pretraining multilingual language models at scale leads to significant
performance gains for a wide range of cross-lingual transfer tasks. We train a Transformer …

InfoXLM: An information-theoretic framework for cross-lingual language model pre-training

Z Chi, L Dong, F Wei, N Yang, S Singhal… - arXiv preprint arXiv …, 2020 - arxiv.org
In this work, we present an information-theoretic framework that formulates cross-lingual
language model pre-training as maximizing mutual information between multilingual-multi …
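For orientation, the mutual-information maximization mentioned here is typically made tractable with an InfoNCE-style contrastive lower bound; the sketch below states that bound in general form and is not InfoXLM's exact objective.

```latex
% InfoNCE-style lower bound on mutual information (general form, assumed
% here for illustration), with f_\theta a learned scoring function and N
% the number of candidates:
\[
  I(x; y) \;\ge\; \log N \;+\;
  \mathbb{E}\!\left[
    \log \frac{\exp f_\theta(x, y)}
              {\sum_{y' \in \mathcal{N} \cup \{y\}} \exp f_\theta(x, y')}
  \right]
\]
% x and y would be the two languages' views of a translation pair and
% \mathcal{N} a set of N-1 negatives drawn from the batch or a queue.
```

Maximizing the right-hand side with a translation pair as (x, y) pulls their representations together relative to the negatives, which is the sense in which such contrastive pre-training maximizes cross-lingual mutual information.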

Do wide and deep networks learn the same things? Uncovering how neural network representations vary with width and depth

T Nguyen, M Raghu, S Kornblith - arXiv preprint arXiv:2010.15327, 2020 - arxiv.org
A key factor in the success of deep neural networks is the ability to scale models to improve
performance by varying the architecture depth and width. This simple property of neural …

Probing pretrained language models for lexical semantics

I Vulić, EM Ponti, R Litschko, G Glavaš… - Proceedings of the …, 2020 - aclanthology.org
The success of large pretrained language models (LMs) such as BERT and RoBERTa has
sparked interest in probing their representations, in order to unveil what types of knowledge …
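Below is a minimal sketch of the kind of lexical probe such work runs, assuming a type-level word vector is obtained by averaging a word's subword states at one layer of an off-the-shelf encoder; the layer index and word list are arbitrary choices for illustration.

```python
# Minimal lexical-semantics probe: extract a word vector from one layer of a
# pretrained LM and compare words by cosine similarity. Layer choice and
# words are illustrative assumptions.
import torch
from transformers import AutoModel, AutoTokenizer

model_name = "bert-base-uncased"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModel.from_pretrained(model_name, output_hidden_states=True)
model.eval()

def word_vector(word: str, layer: int = 8) -> torch.Tensor:
    """Encode the word in isolation and average its subword vectors at `layer`."""
    inputs = tokenizer(word, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**inputs).hidden_states[layer][0]  # (seq_len, dim)
    return hidden[1:-1].mean(dim=0)  # drop [CLS]/[SEP], average subwords

def cosine(u: torch.Tensor, v: torch.Tensor) -> float:
    return torch.nn.functional.cosine_similarity(u, v, dim=0).item()

# Related words should score higher than unrelated ones if the probed
# layer encodes lexical semantics.
print(cosine(word_vector("car"), word_vector("automobile")))
print(cosine(word_vector("car"), word_vector("banana")))
```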

Are all languages created equal in multilingual BERT?

S Wu, M Dredze - arXiv preprint arXiv:2005.09093, 2020 - arxiv.org
Multilingual BERT (mBERT) trained on 104 languages has shown surprisingly good cross-
lingual performance on several NLP tasks, even without explicit cross-lingual signals …

From zero to hero: On the limitations of zero-shot cross-lingual transfer with multilingual transformers

A Lauscher, V Ravishankar, I Vulić… - arXiv preprint arXiv …, 2020 - arxiv.org
Massively multilingual transformers pretrained with language modeling objectives (e.g.,
mBERT, XLM-R) have become a de facto default transfer paradigm for zero-shot cross …

mSLAM: Massively multilingual joint pre-training for speech and text

A Bapna, C Cherry, Y Zhang, Y Jia, M Johnson… - arXiv preprint arXiv …, 2022 - arxiv.org
We present mSLAM, a multilingual Speech and LAnguage Model that learns cross-lingual
cross-modal representations of speech and text by pre-training jointly on large amounts of …