A survey of transformers

T Lin, Y Wang, X Liu, X Qiu - AI Open, 2022 - Elsevier
Transformers have achieved great success in many artificial intelligence fields, such as
natural language processing, computer vision, and audio processing. Therefore, it is natural …

Recent developments on ESPnet toolkit boosted by Conformer

P Guo, F Boyer, X Chang, T Hayashi… - ICASSP 2021-2021 …, 2021 - ieeexplore.ieee.org
In this study, we present recent developments on ESPnet: End-to-End Speech Processing
toolkit, which mainly involves a recently proposed architecture called Conformer …

Conformer: Convolution-augmented transformer for speech recognition

A Gulati, J Qin, CC Chiu, N Parmar, Y Zhang… - arXiv preprint arXiv …, 2020 - arxiv.org
Recently, Transformer- and convolutional neural network (CNN)-based models have shown
promising results in Automatic Speech Recognition (ASR), outperforming recurrent neural …
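
The core design in this paper is the Conformer block: two half-step ("Macaron") feed-forward layers sandwiching multi-head self-attention and a depthwise-convolution module, closed by a final LayerNorm. Below is a minimal PyTorch sketch of that block structure; the hyperparameter names d_model, n_heads, and kernel_size are my own, and dropout plus the paper's relative positional encoding are omitted, so this is an illustration rather than the reference implementation.

import torch
import torch.nn as nn

class ConformerBlock(nn.Module):
    def __init__(self, d_model=256, n_heads=4, kernel_size=31):
        super().__init__()
        self.ffn1 = nn.Sequential(
            nn.LayerNorm(d_model),
            nn.Linear(d_model, 4 * d_model), nn.SiLU(),
            nn.Linear(4 * d_model, d_model))
        self.norm_attn = nn.LayerNorm(d_model)
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.norm_conv = nn.LayerNorm(d_model)
        # Convolution module: pointwise conv + GLU, depthwise conv, pointwise conv.
        self.conv = nn.Sequential(
            nn.Conv1d(d_model, 2 * d_model, 1), nn.GLU(dim=1),
            nn.Conv1d(d_model, d_model, kernel_size,
                      padding=kernel_size // 2, groups=d_model),
            nn.BatchNorm1d(d_model), nn.SiLU(),
            nn.Conv1d(d_model, d_model, 1))
        self.ffn2 = nn.Sequential(
            nn.LayerNorm(d_model),
            nn.Linear(d_model, 4 * d_model), nn.SiLU(),
            nn.Linear(4 * d_model, d_model))
        self.norm_out = nn.LayerNorm(d_model)

    def forward(self, x):                       # x: (batch, time, d_model)
        x = x + 0.5 * self.ffn1(x)              # first half-step feed-forward
        a = self.norm_attn(x)
        x = x + self.attn(a, a, a, need_weights=False)[0]
        c = self.norm_conv(x).transpose(1, 2)   # Conv1d expects (batch, d, time)
        x = x + self.conv(c).transpose(1, 2)
        x = x + 0.5 * self.ffn2(x)              # second half-step feed-forward
        return self.norm_out(x)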

On layer normalization in the transformer architecture

R Xiong, Y Yang, D He, K Zheng… - International …, 2020 - proceedings.mlr.press
The Transformer is widely used in natural language processing tasks. To train a Transformer,
however, one usually needs a carefully designed learning rate warm-up stage, which is …
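
The question the paper studies is where layer normalization sits relative to the residual connection: Post-LN (the original Transformer, which needs warm-up) versus Pre-LN (whose gradients are better behaved at initialization, allowing training without the warm-up stage). A minimal sketch of the two placements, with sublayer standing in for either attention or the feed-forward module (my naming, not the authors' code):

import torch.nn as nn

class PostLNBlock(nn.Module):
    """Original Transformer: LayerNorm applied after the residual addition."""
    def __init__(self, d_model, sublayer):
        super().__init__()
        self.sublayer = sublayer
        self.norm = nn.LayerNorm(d_model)

    def forward(self, x):
        return self.norm(x + self.sublayer(x))

class PreLNBlock(nn.Module):
    """Pre-LN variant: LayerNorm applied inside the residual branch."""
    def __init__(self, d_model, sublayer):
        super().__init__()
        self.sublayer = sublayer
        self.norm = nn.LayerNorm(d_model)

    def forward(self, x):
        return x + self.sublayer(self.norm(x))

# Example: a Pre-LN block wrapping a feed-forward sublayer.
block = PreLNBlock(256, nn.Sequential(
    nn.Linear(256, 1024), nn.ReLU(), nn.Linear(1024, 256)))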

Squeezeformer: An efficient transformer for automatic speech recognition

S Kim, A Gholami, A Shaw, N Lee… - Advances in …, 2022 - proceedings.neurips.cc
The recently proposed Conformer model has become the de facto backbone model for
various downstream speech tasks, owing to its hybrid attention-convolution architecture that …

Understanding the difficulty of training transformers

L Liu, X Liu, J Gao, W Chen, J Han - arXiv preprint arXiv:2004.08249, 2020 - arxiv.org
Transformers have proved effective in many NLP tasks. However, their training requires non-
trivial effort in designing cutting-edge optimizers and learning rate schedulers …

Findings of the IWSLT 2022 Evaluation Campaign

A Anastasopoulos, L Barrault, L Bentivogli… - Proceedings of the 19th …, 2022 - cris.fbk.eu
The evaluation campaign of the 19th International Conference on Spoken Language
Translation featured eight shared tasks: (i) Simultaneous speech translation, (ii) Offline …

The emergence of clusters in self-attention dynamics

B Geshkovski, C Letrouit… - Advances in Neural …, 2024 - proceedings.neurips.cc
Viewing Transformers as interacting particle systems, we describe the geometry of learned
representations when the weights are not time-dependent. We show that particles …
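
Concretely, the interacting-particle view treats the n token representations x_1, …, x_n as points on the unit sphere evolving under attention. A simplified rendering of such dynamics (schematic, with inverse temperature β and the query/key/value weights taken as the identity; the paper's precise statement may differ) is

\dot{x}_i(t) \;=\; \mathbf{P}_{x_i(t)}\!\left( \frac{1}{Z_i(t)} \sum_{j=1}^{n} e^{\beta \langle x_i(t),\, x_j(t) \rangle}\, x_j(t) \right),
\qquad
Z_i(t) \;=\; \sum_{j=1}^{n} e^{\beta \langle x_i(t),\, x_j(t) \rangle},

where \mathbf{P}_{x} projects onto the tangent space of the sphere at x; the emergence of clusters corresponds to the particles collapsing toward a small set of limit points as t \to \infty.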

Energy transformer

B Hoover, Y Liang, B Pham, R Panda… - Advances in …, 2024 - proceedings.neurips.cc
Our work combines aspects of three promising paradigms in machine learning, namely,
attention mechanism, energy-based models, and associative memory. Attention is the power …
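
One standard way to make the attention/associative-memory connection concrete (a schematic, modern-Hopfield-style energy, not necessarily the exact functional this paper proposes) is the log-sum-exp energy over stored patterns k_1, …, k_M:

E(\xi) \;=\; -\frac{1}{\beta} \log \sum_{\mu=1}^{M} \exp\!\big( \beta\, \xi^{\top} k_{\mu} \big) \;+\; \frac{1}{2}\, \xi^{\top} \xi,

whose stationarity condition \nabla E(\xi) = 0 yields the update \xi \leftarrow \sum_{\mu} \operatorname{softmax}_{\mu}\!\big(\beta\, \xi^{\top} k_{\mu}\big)\, k_{\mu}, exactly a softmax attention step; an energy-based model of this kind descends such an energy until the representation settles at a stored memory.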

NASViT: Neural architecture search for efficient vision transformers with gradient conflict-aware supernet training

C Gong, D Wang - ICLR, 2022 - par.nsf.gov
Designing accurate and efficient vision transformers (ViTs) is an important but challenging
task. Supernet-based one-shot neural architecture search (NAS) enables fast architecture …