Challenges and applications of large language models

J Kaddour, J Harris, M Mozes, H Bradley… - arXiv preprint arXiv …, 2023 - arxiv.org
Large Language Models (LLMs) went from non-existent to ubiquitous in the machine
learning discourse within a few years. Due to the fast pace of the field, it is difficult to identify …

Understanding LLMs: A comprehensive overview from training to inference

Y Liu, H He, T Han, X Zhang, M Liu, J Tian, Y Zhang… - Neurocomputing, 2024 - Elsevier
The introduction of ChatGPT has led to a significant increase in the utilization of Large
Language Models (LLMs) for addressing downstream tasks. There's an increasing focus on …

DoRA: Weight-decomposed low-rank adaptation

SY Liu, CY Wang, H Yin, P Molchanov… - … on Machine Learning, 2024 - openreview.net
Among the widely used parameter-efficient fine-tuning (PEFT) methods, LoRA and its
variants have gained considerable popularity because of avoiding additional inference …
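
For context on the LoRA family that DoRA and its variants build on: below is a minimal pure-Python sketch of LoRA's low-rank weight update, delta_W = (alpha / r) * B @ A, added to a frozen pretrained weight. All dimensions and values here are illustrative, not taken from the paper.

```python
def matmul(A, B):
    """Naive matrix multiply, sufficient for small illustrative matrices."""
    n, k, m = len(A), len(B), len(B[0])
    return [[sum(A[i][t] * B[t][j] for t in range(k)) for j in range(m)]
            for i in range(n)]

def lora_delta(B, A, alpha, r):
    """LoRA's low-rank update: delta_W = (alpha / r) * (B @ A)."""
    scale = alpha / r
    return [[scale * x for x in row] for row in matmul(B, A)]

# Frozen pretrained weight W (4 x 4 identity for illustration) plus a
# rank-1 adapter; only B (d_out x r) and A (r x d_in) are trained.
d_out, d_in, r, alpha = 4, 4, 1, 2
W = [[1.0 if i == j else 0.0 for j in range(d_in)] for i in range(d_out)]
B = [[0.5] for _ in range(d_out)]
A = [[0.1] * d_in for _ in range(r)]
delta = lora_delta(B, A, alpha, r)
W_adapted = [[W[i][j] + delta[i][j] for j in range(d_in)] for i in range(d_out)]
```

Because the update is a product of two small matrices, only (d_out + d_in) * r parameters are trained instead of d_out * d_in, and the delta can be merged into W after training so no extra inference cost remains.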

Simulating 500 million years of evolution with a language model

T Hayes, R Rao, H Akin, NJ Sofroniew, D Oktay, Z Lin… - Science, 2025 - science.org
More than three billion years of evolution have produced an image of biology encoded into
the space of natural proteins. Here we show that language models trained at scale on …

Cambrian-1: A fully open, vision-centric exploration of multimodal LLMs

S Tong, E Brown, P Wu, S Woo, M Middepogu… - arXiv preprint arXiv …, 2024 - arxiv.org
We introduce Cambrian-1, a family of multimodal LLMs (MLLMs) designed with a vision-
centric approach. While stronger language models can enhance multimodal capabilities, the …

Sheared LLaMA: Accelerating language model pre-training via structured pruning

M **a, T Gao, Z Zeng, D Chen - arxiv preprint arxiv:2310.06694, 2023 - arxiv.org
The popularity of LLaMA (Touvron et al., 2023a; b) and other recently emerged moderate-
sized large language models (LLMs) highlights the potential of building smaller yet powerful …

YaRN: Efficient context window extension of large language models

B Peng, J Quesnelle, H Fan, E Shippole - arXiv preprint arXiv:2309.00071, 2023 - arxiv.org
Rotary Position Embeddings (RoPE) have been shown to effectively encode positional
information in transformer-based language models. However, these models fail to …
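
The RoPE mechanism this entry extends can be sketched as follows: pairs of vector dimensions are rotated by position-dependent angles, so attention dot products depend only on relative position. This is a minimal illustration, not the paper's code; the base value is the conventional default.

```python
import math

def rope(x, pos, base=10000.0):
    """Apply rotary position embedding to vector x at position `pos`.
    Each dimension pair (2i, 2i+1) is rotated by pos * base**(-2i/d)."""
    d = len(x)
    out = [0.0] * d
    for i in range(d // 2):
        theta = pos * base ** (-2 * i / d)
        c, s = math.cos(theta), math.sin(theta)
        x1, x2 = x[2 * i], x[2 * i + 1]
        out[2 * i] = x1 * c - x2 * s
        out[2 * i + 1] = x1 * s + x2 * c
    return out

# Rotation preserves the vector's norm, and dot(rope(q, m), rope(k, n))
# depends only on m - n, which is how relative position enters attention.
q = [1.0, 0.0, 1.0, 0.0]
q_rot = rope(q, pos=3)
```

Context-extension methods such as the one in this entry work by rescaling the rotation angles so positions beyond the training length map into the range the model saw during pre-training.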

DistriFusion: Distributed parallel inference for high-resolution diffusion models

M Li, T Cai, J Cao, Q Zhang, H Cai… - Proceedings of the …, 2024 - openaccess.thecvf.com
Diffusion models have achieved great success in synthesizing high-quality images.
However, generating high-resolution images with diffusion models is still challenging due to …

OLMo: Accelerating the science of language models

D Groeneveld, I Beltagy, P Walsh, A Bhagia… - arXiv preprint arXiv …, 2024 - arxiv.org
Language models (LMs) have become ubiquitous in both NLP research and in commercial
product offerings. As their commercial importance has surged, the most powerful models …

Autoregressive model beats diffusion: Llama for scalable image generation

P Sun, Y Jiang, S Chen, S Zhang, B Peng… - arXiv preprint arXiv …, 2024 - arxiv.org
We introduce LlamaGen, a new family of image generation models that apply the original "next-
token prediction" paradigm of large language models to the visual generation domain. It is an …