A survey of knowledge enhanced pre-trained language models

L Hu, Z Liu, Z Zhao, L Hou, L Nie… - IEEE Transactions on …, 2023 - ieeexplore.ieee.org
Pre-trained Language Models (PLMs), which are trained on large text corpora via self-
supervised learning methods, have yielded promising performance on various tasks in …

Probing classifiers: Promises, shortcomings, and advances

Y Belinkov - Computational Linguistics, 2022 - direct.mit.edu
Probing classifiers have emerged as one of the prominent methodologies for interpreting
and analyzing deep neural network models of natural language processing. The basic idea …
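
A minimal sketch of the probing setup the snippet refers to: a simple classifier is trained on frozen model representations to test whether some linguistic property is decodable from them. The feature matrix, labels, and property below are illustrative placeholders, not material from the paper.

```python
# Probing-classifier sketch: train a simple classifier on frozen hidden
# states to check whether a property (e.g., a binary linguistic feature)
# is linearly decodable from them. Representations are random stand-ins.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
hidden_states = rng.normal(size=(1000, 768))  # stand-in for frozen model representations
labels = rng.integers(0, 2, size=1000)        # stand-in for the probed property

X_train, X_test, y_train, y_test = train_test_split(
    hidden_states, labels, test_size=0.2, random_state=0
)

probe = LogisticRegression(max_iter=1000)  # the probe itself is kept deliberately simple
probe.fit(X_train, y_train)
print("probe accuracy:", probe.score(X_test, y_test))
```

High probe accuracy is usually read as evidence that the property is encoded in the representations, which is exactly the inference whose promises and shortcomings the survey examines.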

Modern language models refute Chomsky's approach to language

ST Piantadosi - From fieldwork to linguistic theory: A tribute to …, 2023 - books.google.com
Modern machine learning has subverted and bypassed the theoretical framework of
Chomsky's generative approach to linguistics, including its core claims to particular insights …

Leveraging large language models for multiple choice question answering

J Robinson, CM Rytting, D Wingate - arXiv preprint arXiv:2210.12353, 2022 - arxiv.org
While large language models (LLMs) like GPT-3 have achieved impressive results on
multiple choice question answering (MCQA) tasks in the zero-, one-, and few-shot settings …
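
A hedged sketch of one common way to apply an LLM to MCQA: score each candidate answer by its log-likelihood as a continuation of the question and pick the highest. The model choice (gpt2), prompt format, and token-boundary handling below are assumptions for illustration, not details taken from the paper.

```python
# MCQA-by-scoring sketch: rank answer options by the causal LM's
# log-likelihood of each option as a continuation of the question.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

question = "Question: What is the capital of France? Answer:"
options = [" Paris", " London", " Berlin", " Madrid"]

def option_logprob(question, option):
    """Sum of token log-probabilities of `option` given `question`."""
    q_ids = tokenizer(question, return_tensors="pt").input_ids
    full_ids = tokenizer(question + option, return_tensors="pt").input_ids
    with torch.no_grad():
        logits = model(full_ids).logits
    log_probs = torch.log_softmax(logits[:, :-1], dim=-1)
    targets = full_ids[:, 1:]
    token_lp = log_probs.gather(-1, targets.unsqueeze(-1)).squeeze(-1)
    # keep only the positions that belong to the answer option
    return token_lp[:, q_ids.shape[1] - 1:].sum().item()

scores = {opt.strip(): option_logprob(question, opt) for opt in options}
print(max(scores, key=scores.get), scores)
```

An alternative the paper contrasts with this style is to present all labeled options in one prompt and have the model emit the answer letter directly.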

Towards understanding cross and self-attention in stable diffusion for text-guided image editing

B Liu, C Wang, T Cao, K Jia… - Proceedings of the IEEE …, 2024 - openaccess.thecvf.com
Deep Text-to-Image Synthesis (TIS) models such as Stable Diffusion have recently
gained significant popularity for creative text-to-image generation. However, for domain …
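
For context on the mechanism the title refers to, here is a generic cross-attention sketch, not Stable Diffusion's actual implementation: image-latent tokens supply the queries while text-encoder tokens supply keys and values, which is how the prompt steers generation. All module names and dimensions are illustrative assumptions.

```python
# Generic cross-attention sketch: queries from image latents,
# keys/values from text embeddings. Dimensions are illustrative.
import torch
import torch.nn as nn

class CrossAttention(nn.Module):
    def __init__(self, latent_dim=320, text_dim=768, head_dim=64):
        super().__init__()
        self.scale = head_dim ** -0.5
        self.to_q = nn.Linear(latent_dim, head_dim, bias=False)  # queries from image latents
        self.to_k = nn.Linear(text_dim, head_dim, bias=False)    # keys from text tokens
        self.to_v = nn.Linear(text_dim, head_dim, bias=False)    # values from text tokens
        self.to_out = nn.Linear(head_dim, latent_dim)

    def forward(self, latents, text):
        q, k, v = self.to_q(latents), self.to_k(text), self.to_v(text)
        attn = torch.softmax(q @ k.transpose(-1, -2) * self.scale, dim=-1)
        return self.to_out(attn @ v)  # text-conditioned update of the latents

latents = torch.randn(1, 64 * 64, 320)  # flattened spatial latent tokens
text = torch.randn(1, 77, 768)          # CLIP-style text embeddings
print(CrossAttention()(latents, text).shape)  # torch.Size([1, 4096, 320])
```

Self-attention in the same network mixes the latent tokens with each other; the paper studies how these two attention types behave during text-guided editing.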

Pre-trained models: Past, present and future

X Han, Z Zhang, N Ding, Y Gu, X Liu, Y Huo, J Qiu… - AI Open, 2021 - Elsevier
Large-scale pre-trained models (PTMs) such as BERT and GPT have recently achieved
great success and become a milestone in the field of artificial intelligence (AI). Owing to …

AutoPrompt: Eliciting knowledge from language models with automatically generated prompts

T Shin, Y Razeghi, RL Logan IV, E Wallace… - arXiv preprint arXiv …, 2020 - arxiv.org
The remarkable success of pretrained language models has motivated the study of what
kinds of knowledge these models learn during pretraining. Reformulating tasks as fill-in-the …
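
A small sketch of the fill-in-the-blank style of knowledge elicitation this snippet mentions: a cloze prompt is handed to a masked language model and the top predictions for the mask are read off as the model's "knowledge". AutoPrompt itself searches for trigger tokens automatically via gradients; the hand-written prompt and model choice below are only illustrative.

```python
# Cloze-style knowledge elicitation sketch: query a masked language model
# with a fill-in-the-blank prompt and inspect its top predictions.
# (AutoPrompt additionally learns trigger tokens; this prompt is manual.)
from transformers import pipeline

fill_mask = pipeline("fill-mask", model="bert-base-uncased")
prompt = "The capital of France is [MASK]."

for prediction in fill_mask(prompt, top_k=3):
    print(f"{prediction['token_str']:>10}  {prediction['score']:.3f}")
```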

A comprehensive study of knowledge editing for large language models

N Zhang, Y Yao, B Tian, P Wang, S Deng… - arXiv preprint arXiv …, 2024 - arxiv.org
Large Language Models (LLMs) have shown extraordinary capabilities in understanding
and generating text that closely mirrors human communication. However, a primary …

SPoT: Better frozen model adaptation through soft prompt transfer

T Vu, B Lester, N Constant, R Al-Rfou, D Cer - arXiv preprint arXiv …, 2021 - arxiv.org
There has been growing interest in parameter-efficient methods to apply pre-trained
language models to downstream tasks. Building on the Prompt Tuning approach of Lester et …
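
A minimal sketch of the soft-prompt mechanism that SPoT builds on: a small matrix of learnable prompt embeddings is prepended to the frozen model's input embeddings, and only those prompt parameters would be trained. SPoT additionally transfers prompts learned on source tasks, which this sketch does not show; the model choice and shapes are illustrative assumptions.

```python
# Soft prompt tuning sketch: prepend learnable prompt embeddings to the
# frozen backbone's input embeddings; only the prompt would be trained.
import torch
import torch.nn as nn
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
for p in model.parameters():  # freeze the backbone
    p.requires_grad = False

n_prompt, d_model = 20, model.config.n_embd
soft_prompt = nn.Parameter(torch.randn(n_prompt, d_model) * 0.02)

ids = tokenizer("great movie, would watch again", return_tensors="pt").input_ids
tok_emb = model.transformer.wte(ids)                          # (1, seq, d_model)
inputs = torch.cat([soft_prompt.unsqueeze(0), tok_emb], dim=1)

out = model(inputs_embeds=inputs)  # during training, only soft_prompt would receive gradients
print(out.logits.shape)
```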

Transformer feed-forward layers are key-value memories

M Geva, R Schuster, J Berant, O Levy - arXiv preprint arXiv:2012.14913, 2020 - arxiv.org
Feed-forward layers constitute two-thirds of a transformer model's parameters, yet their role
in the network remains under-explored. We show that feed-forward layers in transformer …
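
The key-value reading of a feed-forward layer, sketched numerically under illustrative dimensions: the rows of the first projection act as keys matched against the input, and the resulting activations weight the rows of the second projection as values. The equivalence check below is a generic illustration, not code from the paper.

```python
# Feed-forward layer viewed as key-value memory: FFN(x) = f(x @ K.T) @ V,
# where K holds one key per memory slot and V one value per slot.
import torch
import torch.nn as nn

d_model, d_ff = 512, 2048
x = torch.randn(1, d_model)

fc1 = nn.Linear(d_model, d_ff, bias=False)
fc2 = nn.Linear(d_ff, d_model, bias=False)
standard = fc2(torch.relu(fc1(x)))          # usual two-layer feed-forward computation

keys = fc1.weight                           # (d_ff, d_model): one key per memory slot
values = fc2.weight.T                       # (d_ff, d_model): one value per memory slot
coefficients = torch.relu(x @ keys.T)       # how strongly the input matches each key
memory_view = coefficients @ values         # weighted sum of the retrieved values

print(torch.allclose(standard, memory_view, atol=1e-5))  # True: same computation
```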