Exploring the limits of transfer learning with a unified text-to-text transformer C Raffel, N Shazeer, A Roberts, K Lee, S Narang, M Matena, Y Zhou, W Li, ... Journal of machine learning research 21 (140), 1-67, 2020 | 21087 | 2020 |
PaLM 2 technical report R Anil, AM Dai, O Firat, M Johnson, D Lepikhin, A Passos, S Shakeri, ... arXiv preprint arXiv:2305.10403, 2023 | 1557 | 2023 |
Exploring the limits of transfer learning with a unified text-to-text transformer (2019) C Raffel, N Shazeer, A Roberts, K Lee, S Narang, M Matena, Y Zhou, W Li, ... arXiv preprint arXiv:1910.10683, 2020 | 249* | 2020 |
A streaming on-device end-to-end model surpassing server-side conventional model quality and latency TN Sainath, Y He, B Li, A Narayanan, R Pang, A Bruguier, S Chang, W Li, ... ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and …, 2020 | 235 | 2020 |
Monotonic infinite lookback attention for simultaneous machine translation N Arivazhagan, C Cherry, W Macherey, CC Chiu, S Yavuz, R Pang, W Li, ... arXiv preprint arXiv:1906.05218, 2019 | 201 | 2019 |
Two-pass end-to-end speech recognition TN Sainath, R Pang, D Rybach, Y He, R Prabhavalkar, W Li, M Visontai, ... arXiv preprint arXiv:1908.10992, 2019 | 173 | 2019 |
Delving into out-of-distribution detection with vision-language representations Y Ming, Z Cai, J Gu, Y Sun, W Li, Y Li Advances in neural information processing systems 35, 35087-35102, 2022 | 162 | 2022 |
Do transformer modifications transfer across implementations and applications? S Narang, HW Chung, Y Tay, W Fedus, T Fevry, M Matena, K Malkan, ... arXiv preprint arXiv:2102.11972, 2021 | 111 | 2021 |
Streaming small-footprint keyword spotting using sequence-to-sequence models Y He, R Prabhavalkar, K Rao, W Li, A Bakhtin, I McGraw 2017 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU …, 2017 | 105 | 2017 |
VoiceFilter-Lite: Streaming targeted voice separation for on-device speech recognition Q Wang, IL Moreno, M Saglam, K Wilson, A Chiao, R Liu, Y He, W Li, ... arXiv preprint arXiv:2009.04323, 2020 | 103 | 2020 |
Tied & reduced RNN-T decoder R Botros, TN Sainath, R David, E Guzman, W Li, Y He arXiv preprint arXiv:2109.07513, 2021 | 60 | 2021 |
PaLM 2 Technical Report R Anil, AM Dai, O Firat, M Johnson, D Lepikhin, A Passos, S Shakeri, ... arXiv preprint arXiv:2305.10403, 2023 | 55 | 2023 |
An Efficient Streaming Non-Recurrent On-Device End-to-End Model with Improvements to Rare-Word Modeling TN Sainath, Y He, A Narayanan, R Botros, R Pang, D Rybach, C Allauzen, ... Interspeech 8, 1777-1781, 2021 | 48 | 2021 |
Video instruction tuning with synthetic data Y Zhang, J Wu, W Li, B Li, Z Ma, Z Liu, C Li arXiv preprint arXiv:2410.02713, 2024 | 36 | 2024 |
Learning word-level confidence for subword end-to-end automatic speech recognition D Qiu, Q Li, Y He, Y Zhang, B Li, L Cao, R Prabhavalkar, D Bhatia, W Li, ... US Patent 11,610,586, 2023 | 31 | 2023 |
Learning word-level confidence for subword end-to-end ASR D Qiu, Q Li, Y He, Y Zhang, B Li, L Cao, R Prabhavalkar, D Bhatia, W Li, ... ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and …, 2021 | 31 | 2021 |
MaMMUT: A simple architecture for joint learning for multimodal tasks W Kuo, AJ Piergiovanni, D Kim, X Luo, B Caine, W Li, A Ogale, L Zhou, ... arXiv preprint arXiv:2303.16839, 2023 | 29 | 2023 |
Answer-me: Multi-task open-vocabulary visual question answering AJ Piergiovanni, W Li, W Kuo, M Saffar, F Bertsch, A Angelova arXiv preprint arXiv:2205.00949, 2022 | 20 | 2022 |
Low Latency Speech Recognition Using End-to-End Prefetching SY Chang, B Li, D Rybach, Y He, W Li, TN Sainath, T Strohman Interspeech, 1962-1966, 2020 | 20 | 2020 |
FindIt: Generalized localization with natural language queries W Kuo, F Bertsch, W Li, AJ Piergiovanni, M Saffar, A Angelova European Conference on Computer Vision, 502-520, 2022 | 17 | 2022 |