Gemma 2: Improving open language models at a practical size
In this work, we introduce Gemma 2, a new addition to the Gemma family of lightweight,
state-of-the-art open models, ranging in scale from 2 billion to 27 billion parameters. In this new …
A review of sparse expert models in deep learning
Sparse expert models are a thirty-year-old concept re-emerging as a popular architecture in
deep learning. This class of architecture encompasses Mixture-of-Experts, Switch …
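To make the surveyed architecture class concrete, here is a minimal sketch of top-k expert routing, the mechanism shared by Mixture-of-Experts and Switch-style layers; all shapes, weights, and names are illustrative, not from any specific surveyed paper.

```python
import numpy as np

rng = np.random.default_rng(0)
d_model, n_experts, top_k = 8, 4, 2

tokens = rng.normal(size=(5, d_model))            # 5 token embeddings
router_w = rng.normal(size=(d_model, n_experts))  # learned router projection
experts = [rng.normal(size=(d_model, d_model)) for _ in range(n_experts)]

def softmax(x):
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

gates = softmax(tokens @ router_w)                # (5, n_experts) routing probs

out = np.zeros_like(tokens)
for i, tok in enumerate(tokens):
    # Each token is processed only by its top-k experts, weighted by the
    # renormalized gate values; this is the "sparse" in sparse expert models.
    chosen = np.argsort(gates[i])[-top_k:]
    weights = gates[i, chosen] / gates[i, chosen].sum()
    for w, e in zip(weights, chosen):
        out[i] += w * (tok @ experts[e])

print(out.shape)  # (5, 8): output matches input shape, computed sparsely
```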
Scaling instruction-finetuned language models
Finetuning language models on a collection of datasets phrased as instructions has been
shown to improve model performance and generalization to unseen tasks. In this paper we …
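A minimal sketch of what "phrased as instructions" means in practice: a raw labeled example is rewritten with a natural-language instruction template before finetuning. The template and task below are illustrative, not the paper's actual prompt set.

```python
def to_instruction_example(premise: str, hypothesis: str, label: str) -> dict:
    # Wrap a raw NLI example in a natural-language instruction template.
    prompt = (
        f"Premise: {premise}\n"
        f"Hypothesis: {hypothesis}\n"
        "Does the premise entail the hypothesis? Answer yes, no, or maybe."
    )
    return {"input": prompt, "target": label}

example = to_instruction_example(
    premise="The cat sat on the mat.",
    hypothesis="An animal is on the mat.",
    label="yes",
)
print(example["input"])
print("->", example["target"])
```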
PaLM: Scaling language modeling with Pathways
Large language models have been shown to achieve remarkable performance across a
variety of natural language tasks using few-shot learning, which drastically reduces the …
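For context, a minimal sketch of the few-shot learning the abstract refers to: a handful of worked examples are prepended to the query and the model simply continues the pattern, with no gradient updates. The task and examples are illustrative.

```python
# Two worked translation examples are prepended to a new query; a capable
# pretrained model completes the pattern directly.
few_shot_examples = [
    ("English: cheese -> French:", "fromage"),
    ("English: house -> French:", "maison"),
]
query = "English: bread -> French:"

prompt = "\n".join(f"{q} {a}" for q, a in few_shot_examples) + "\n" + query
print(prompt)
# The model's continuation of this string is the answer; no per-task
# training data or finetuning is involved.
```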
Gemma: Open models based on Gemini research and technology
This work introduces Gemma, a family of lightweight, state-of-the-art open models built from
the research and technology used to create Gemini models. Gemma models demonstrate …
Scaling language-image pre-training via masking
We present Fast Language-Image Pre-training (FLIP), a simple and more efficient
method for training CLIP. Our method randomly masks out and removes a large portion of …
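A minimal sketch of the masking step the abstract describes, assuming a ViT-style encoder that consumes image patches; the patch grid and the 75% mask ratio are illustrative values, not necessarily the paper's.

```python
import numpy as np

rng = np.random.default_rng(0)
n_patches, d_patch, mask_ratio = 196, 768, 0.75  # e.g. a 14x14 patch grid

patches = rng.normal(size=(n_patches, d_patch))  # patch embeddings for one image

# Keep a random subset of patches and drop the rest entirely, so the image
# encoder only ever processes the visible patches.
n_keep = int(n_patches * (1 - mask_ratio))
keep_idx = rng.permutation(n_patches)[:n_keep]
visible = patches[keep_idx]

print(visible.shape)  # (49, 768): roughly 4x less encoder compute per image
```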
Crosslingual generalization through multitask finetuning
Multitask prompted finetuning (MTF) has been shown to help large language models
generalize to new tasks in a zero-shot setting, but so far explorations of MTF have focused …
PaLI: A jointly-scaled multilingual language-image model
Effective scaling and a flexible task interface enable large language models to excel at many
tasks. We present PaLI (Pathways Language and Image model), a model that extends this …
Paraphrasing evades detectors of AI-generated text, but retrieval is an effective defense
The rise in malicious usage of large language models, such as fake content creation and
academic plagiarism, has motivated the development of approaches that identify AI …
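A toy sketch of a retrieval-style defense as named in the title: the provider stores the text its model has generated and flags a candidate passage that closely matches any stored generation, even after paraphrasing. The word-overlap similarity and the 0.5 threshold below are stand-ins for the semantic retrieval a real system would use.

```python
def jaccard(a: str, b: str) -> float:
    # Toy similarity: word-overlap Jaccard between two passages.
    wa, wb = set(a.lower().split()), set(b.lower().split())
    return len(wa & wb) / len(wa | wb)

# The provider logs everything its model has generated.
generated_db = ["the quick brown fox jumps over the lazy dog"]

# A paraphrased version of the stored generation still overlaps heavily.
candidate = "the speedy brown fox leaps over the lazy dog"

score = max(jaccard(candidate, g) for g in generated_db)
print(f"max similarity: {score:.2f}, flagged: {score > 0.5}")
```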
AudioLM: a language modeling approach to audio generation
We introduce AudioLM, a framework for high-quality audio generation with long-term
consistency. AudioLM maps the input audio to a sequence of discrete tokens and casts …
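A minimal sketch of the framing in the abstract, audio generation cast as language modeling over discrete tokens; the uniform amplitude quantizer and bigram "model" below are toy stand-ins for the learned audio tokenizers and Transformer a real system uses.

```python
import math
from collections import Counter

def tokenize(samples, n_levels=16):
    # Quantize each sample in [-1, 1] to one of n_levels discrete tokens.
    return [min(int((s + 1) / 2 * n_levels), n_levels - 1) for s in samples]

# A 440 Hz sine at 16 kHz stands in for real input audio.
audio = [math.sin(2 * math.pi * 440 * t / 16000) for t in range(64)]
tokens = tokenize(audio)

# "Language model" over the token sequence: bigram counts stand in for a
# Transformer; generation is next-token prediction given the last token.
bigrams = Counter(zip(tokens, tokens[1:]))
last = tokens[-1]
next_tok = max((b for b in bigrams if b[0] == last), key=bigrams.get)[1]
print(tokens[:8], "... predicted next token:", next_tok)
```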