Google Scholar

J Li - APSIPA Transactions on Signal and Information …, 2022 - nowpublishers.com

Recently, the speech community is seeing a significant trend of moving from deep neural
network based hybrid modeling to end-to-end (E2E) modeling for automatic speech …

Cited by 452 Related articles All 8 versions

[PDF] mdpi.com

Vision transformers for remote sensing image classification

Y Bazi, L Bashmal, MMA Rahhal, RA Dayil, NA Ajlan - Remote Sensing, 2021 - mdpi.com

In this paper, we propose a remote-sensing scene-classification method based on vision
transformers. These types of networks, which are now recognized as state-of-the-art models …

Cited by 471 Related articles All 6 versions

[PDF] arxiv.org

Paraformer: Fast and accurate parallel transformer for non-autoregressive end-to-end speech recognition

Z Gao, S Zhang, I McLoughlin, Z Yan - arXiv preprint arXiv:2206.08317, 2022 - arxiv.org

Transformers have recently dominated the ASR field. Although able to yield good
performance, they involve an autoregressive (AR) decoder to generate tokens one by one …

Cited by 102 Related articles All 9 versions

[PDF] arxiv.org

A survey on non-autoregressive generation for neural machine translation and beyond

Y Xiao, L Wu, J Guo, J Li, M Zhang… - IEEE Transactions on …, 2023 - ieeexplore.ieee.org

Non-autoregressive (NAR) generation, which is first proposed in neural machine translation
(NMT) to speed up inference, has attracted much attention in both machine learning and …

Cited by 94 Related articles All 9 versions

[PDF] arxiv.org

Joint entity and relation extraction with set prediction networks

D Sui, X Zeng, Y Chen, K Liu… - IEEE transactions on …, 2023 - ieeexplore.ieee.org

Joint entity and relation extraction is an important task in natural language processing, which
aims to extract all relational triples mentioned in a given sentence. In essence, the relational …

Cited by 211 Related articles All 4 versions

[PDF] mdpi.com

Vision–language model for visual question answering in medical imagery

Y Bazi, MMA Rahhal, L Bashmal, M Zuair - Bioengineering, 2023 - mdpi.com

In the clinical and healthcare domains, medical images play a critical role. A mature medical
visual question answering system (VQA) can improve diagnosis by answering clinical …

Cited by 61 Related articles All 8 versions

[PDF] arxiv.org

Intermediate loss regularization for CTC-based speech recognition

J Lee, S Watanabe - ICASSP 2021-2021 IEEE International …, 2021 - ieeexplore.ieee.org

We present a simple and efficient auxiliary loss function for automatic speech recognition
(ASR) based on the connectionist temporal classification (CTC) objective. The proposed …

Cited by 159 Related articles All 5 versions

[PDF] arxiv.org

Mask CTC: Non-autoregressive end-to-end ASR with CTC and mask predict

Y Higuchi, S Watanabe, N Chen, T Ogawa… - arXiv preprint arXiv …, 2020 - arxiv.org

We present Mask CTC, a novel non-autoregressive end-to-end automatic speech
recognition (ASR) framework, which generates a sequence by refining outputs of the …

Cited by 154 Related articles All 10 versions

Vision Transformer‐based recognition of diabetic retinopathy grade

J Wu, R Hu, Z Xiao, J Chen, J Liu - Medical Physics, 2021 - Wiley Online Library

Background In the domain of natural language processing, Transformers are recognized as
state‐of‐the‐art models, which opposing to typical convolutional neural networks (CNNs) do …

Cited by 91 Related articles All 4 versions

[PDF] mlr.press

Imputer: Sequence modelling via imputation and dynamic programming

W Chan, C Saharia, G Hinton… - International …, 2020 - proceedings.mlr.press

This paper presents the Imputer, a neural sequence model that generates output sequences
iteratively via imputations. The Imputer is an iterative generation model, requiring only a …

Cited by 134 Related articles All 9 versions

Non-autoregressive transformer for speech recognition

[PDF] Recent advances in end-to-end automatic speech recognition
