Recent advances in end-to-end automatic speech recognition
J Li - APSIPA Transactions on Signal and Information …, 2022 - nowpublishers.com
Recently, the speech community is seeing a significant trend of moving from deep neural
network based hybrid modeling to end-to-end (E2E) modeling for automatic speech …
Vision transformers for remote sensing image classification
In this paper, we propose a remote-sensing scene-classification method based on vision
transformers. These types of networks, which are now recognized as state-of-the-art models …
Paraformer: Fast and accurate parallel transformer for non-autoregressive end-to-end speech recognition
Transformers have recently dominated the ASR field. Although able to yield good
performance, they involve an autoregressive (AR) decoder to generate tokens one by one …
A survey on non-autoregressive generation for neural machine translation and beyond
Non-autoregressive (NAR) generation, which is first proposed in neural machine translation
(NMT) to speed up inference, has attracted much attention in both machine learning and …
Joint entity and relation extraction with set prediction networks
Joint entity and relation extraction is an important task in natural language processing, which
aims to extract all relational triples mentioned in a given sentence. In essence, the relational …
Vision–language model for visual question answering in medical imagery
In the clinical and healthcare domains, medical images play a critical role. A mature medical
visual question answering system (VQA) can improve diagnosis by answering clinical …
Intermediate loss regularization for ctc-based speech recognition
We present a simple and efficient auxiliary loss function for automatic speech recognition
(ASR) based on the connectionist temporal classification (CTC) objective. The proposed …
Mask CTC: Non-autoregressive end-to-end ASR with CTC and mask predict
We present Mask CTC, a novel non-autoregressive end-to-end automatic speech
recognition (ASR) framework, which generates a sequence by refining outputs of the …
Vision Transformer‐based recognition of diabetic retinopathy grade
J Wu, R Hu, Z Xiao, J Chen, J Liu - Medical Physics, 2021 - Wiley Online Library
Background In the domain of natural language processing, Transformers are recognized as
state‐of‐the‐art models, which opposing to typical convolutional neural networks (CNNs) do …
Imputer: Sequence modelling via imputation and dynamic programming
This paper presents the Imputer, a neural sequence model that generates output sequences
iteratively via imputations. The Imputer is an iterative generation model, requiring only a …