Visual attention methods in deep learning: An in-depth survey

M Hassanin, S Anwar, I Radwan, FS Khan, A Mian - Information Fusion, 2024 - Elsevier
Inspired by the human cognitive system, attention is a mechanism that imitates human
cognitive awareness of specific information, amplifying critical details to focus more on …

Training dynamics of multi-head softmax attention for in-context learning: Emergence, convergence, and optimality

S Chen, H Sheen, T Wang, Z Yang - arXiv preprint arXiv:2402.19442, 2024 - arxiv.org
We study the dynamics of gradient flow for training a multi-head softmax attention model for
in-context learning of multi-task linear regression. We establish the global convergence of …
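
For reference, a minimal sketch of the multi-head softmax attention computation analyzed in such work, assuming NumPy; the shapes, head count, and the name `multi_head_attention` are illustrative, not the paper's construction:

```python
# Minimal sketch of multi-head softmax attention, assuming NumPy;
# shapes, head count, and function names here are illustrative only.
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)  # numerical stability
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def multi_head_attention(X, W_q, W_k, W_v, n_heads):
    """X: (seq_len, d_model); W_q/W_k/W_v: (d_model, d_model)."""
    seq_len, d_model = X.shape
    d_head = d_model // n_heads
    Q, K, V = X @ W_q, X @ W_k, X @ W_v
    # Split into heads: (n_heads, seq_len, d_head)
    Q = Q.reshape(seq_len, n_heads, d_head).transpose(1, 0, 2)
    K = K.reshape(seq_len, n_heads, d_head).transpose(1, 0, 2)
    V = V.reshape(seq_len, n_heads, d_head).transpose(1, 0, 2)
    # Scaled dot-product softmax attention, computed per head
    scores = Q @ K.transpose(0, 2, 1) / np.sqrt(d_head)
    out = softmax(scores) @ V  # (n_heads, seq_len, d_head)
    return out.transpose(1, 0, 2).reshape(seq_len, d_model)

rng = np.random.default_rng(0)
X = rng.standard_normal((8, 16))
W = [rng.standard_normal((16, 16)) * 0.1 for _ in range(3)]
print(multi_head_attention(X, *W, n_heads=4).shape)  # (8, 16)
```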

Few-shot named entity recognition: An empirical baseline study

J Huang, C Li, K Subudhi, D Jose… - Proceedings of the …, 2021 - aclanthology.org
This paper presents an empirical study to efficiently build named entity recognition (NER)
systems when a small amount of in-domain labeled data is available. Based upon recent …

Code structure–guided transformer for source code summarization

S Gao, C Gao, Y He, J Zeng, L Nie, X Xia… - ACM Transactions on …, 2023 - dl.acm.org
Code summaries help developers comprehend programs and reduce the time needed to
infer program functionality during software maintenance. Recent efforts resort to deep learning …

Few-shot named entity recognition: A comprehensive study

J Huang, C Li, K Subudhi, D Jose… - arXiv preprint arXiv …, 2020 - arxiv.org
This paper presents a comprehensive study to efficiently build named entity recognition
(NER) systems when a small amount of in-domain labeled data is available. Based upon …

Unraveling attention via convex duality: Analysis and interpretations of vision transformers

A Sahiner, T Ergen, B Ozturkler… - International …, 2022 - proceedings.mlr.press
Vision transformers using self-attention or its proposed alternatives have demonstrated
promising results in many image-related tasks. However, the underpinning inductive bias of …

Combining external-latent attention for medical image segmentation

E Song, B Zhan, H Liu - Neural Networks, 2024 - Elsevier
The attention mechanism offers a new entry point for improving the performance of
medical image segmentation. How to reasonably assign weights is a key element of the …
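
For context, a hedged sketch of the external-attention building block (two small learnable memories with double normalization, in the spirit of Guo et al.'s external attention); how this paper combines it with latent attention may differ:

```python
# Minimal sketch of an external-attention block: two small learnable
# memories M_k, M_v shared across samples, with double normalization.
# Assumes NumPy; the paper's exact external-latent combination may differ.
import numpy as np

def external_attention(X, M_k, M_v):
    """X: (n_pixels, d); M_k, M_v: (memory_size, d) learnable memories."""
    attn = X @ M_k.T  # (n_pixels, memory_size)
    attn = np.exp(attn - attn.max(axis=0, keepdims=True))
    attn = attn / attn.sum(axis=0, keepdims=True)           # softmax over pixels
    attn = attn / (attn.sum(axis=1, keepdims=True) + 1e-9)  # L1 norm over memory
    return attn @ M_v  # (n_pixels, d)

rng = np.random.default_rng(0)
X = rng.standard_normal((64, 32))  # flattened feature map
M_k = rng.standard_normal((8, 32))
M_v = rng.standard_normal((8, 32))
print(external_attention(X, M_k, M_v).shape)  # (64, 32)
```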

Balancing speciality and versatility: a coarse to fine framework for supervised fine-tuning large language model

H Zhang, Y Wu, D Li, S Yang, R Zhao, Y Jiang… - arXiv preprint arXiv …, 2024 - arxiv.org
Aligned Large Language Models (LLMs) showcase remarkable versatility, capable of
handling diverse real-world tasks. Meanwhile, aligned LLMs are also expected to exhibit …

Superiority of multi-head attention in in-context linear regression

Y Cui, J Ren, P He, J Tang, Y Xing - arXiv preprint arXiv:2401.17426, 2024 - arxiv.org
We present a theoretical analysis of the performance of transformers with softmax attention in
in-context learning with linear regression tasks. While the existing literature predominantly …
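
A minimal sketch of how in-context linear regression prompts are commonly set up in this literature, assuming the usual stacked (x_i, y_i) token convention; the function `make_prompt` and its defaults are hypothetical:

```python
# Sketch of an in-context linear regression task: a prompt of (x_i, y_i)
# pairs plus a query x whose label the model must predict. The stacked
# embedding below is one common convention, not the paper's exact setup.
import numpy as np

def make_prompt(n_examples=10, d=5, seed=0):
    rng = np.random.default_rng(seed)
    w = rng.standard_normal(d)  # task-specific weight vector
    X = rng.standard_normal((n_examples + 1, d))
    y = X @ w
    # Stack each token as [x_i; y_i], masking the query's label with 0.
    tokens = np.concatenate([X, y[:, None]], axis=1)
    tokens[-1, -1] = 0.0  # the model must predict this entry
    return tokens, y[-1]  # prompt and the held-out target

tokens, target = make_prompt()
print(tokens.shape, target)  # (11, 6) and the query's true label
```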

Exploring predictive uncertainty and calibration in NLP: A study on the impact of method & data scarcity

D Ulmer, J Frellsen, C Hardmeier - arXiv preprint arXiv:2210.15452, 2022 - arxiv.org
We investigate the problem of determining the predictive confidence (or, conversely,
uncertainty) of a neural classifier through the lens of low-resource languages. By training …
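
One standard metric in such calibration studies is the expected calibration error (ECE); below is a hedged sketch with equal-width confidence bins, where the 10-bin scheme is illustrative rather than the paper's exact setup:

```python
# Minimal sketch of expected calibration error (ECE) with equal-width
# confidence bins, a standard metric in calibration studies; the 10-bin
# scheme below is illustrative, not this paper's exact configuration.
import numpy as np

def expected_calibration_error(confidences, correct, n_bins=10):
    """confidences: predicted max-class probabilities; correct: 0/1 hits."""
    confidences, correct = np.asarray(confidences), np.asarray(correct)
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        in_bin = (confidences > lo) & (confidences <= hi)
        if in_bin.any():
            # Gap between bin accuracy and mean confidence, weighted by
            # the fraction of samples falling in the bin.
            gap = abs(correct[in_bin].mean() - confidences[in_bin].mean())
            ece += in_bin.mean() * gap
    return ece

print(expected_calibration_error([0.9, 0.8, 0.6, 0.95], [1, 0, 1, 1]))
```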