One TTS alignment to rule them all R Badlani, A Łańcucki, KJ Shih, R Valle, W Ping, B Catanzaro ICASSP 2022-2022 IEEE International Conference on Acoustics, Speech and …, 2022 | 92 | 2022 |
Audio Flamingo: A novel audio language model with few-shot learning and dialogue abilities Z Kong, A Goel, R Badlani, W Ping, R Valle, B Catanzaro arXiv preprint arXiv:2402.01831, 2024 | 65 | 2024 |
Content-based representations of audio using siamese neural networks P Manocha, R Badlani, A Kumar, A Shah, B Elizalde, B Raj 2018 IEEE International Conference on Acoustics, Speech and Signal …, 2018 | 62 | 2018 |
Experiments on the DCASE challenge 2016: Acoustic scene classification and sound event detection in real life recording B Elizalde, A Kumar, A Shah, R Badlani, E Vincent, B Raj, I Lane arXiv preprint arXiv:1607.06706, 2016 | 59* | 2016 |
RAD-TTS: Parallel flow-based TTS with robust alignment learning and diverse synthesis KJ Shih, R Valle, R Badlani, A Łańcucki, W Ping, B Catanzaro ICML Workshop on Invertible Neural Networks, Normalizing Flows, and Explicit …, 2021 | 55 | 2021 |
NELS - Never-Ending Learner of Sounds B Elizalde, R Badlani, A Shah, A Kumar, B Raj NIPS Workshop on Machine Learning for Audio, 2018 | 34* | 2018 |
An approach for self-training audio event detectors using web data B Elizalde, A Shah, S Dalmia, MH Lee, R Badlani, A Kumar, B Raj, I Lane 2017 25th European Signal Processing Conference (EUSIPCO), 1863-1867, 2017 | 30* | 2017 |
P-Flow: A fast and data-efficient zero-shot TTS through speech prompting S Kim, K Shih, JF Santos, E Bakhturina, M Desta, R Valle, S Yoon, ... Advances in Neural Information Processing Systems 36, 2024 | 29 | 2024 |
Disambiguating sentiment: An ensemble of humour, sarcasm, and hate speech features for sentiment classification R Badlani, N Asnani, M Rai W-NUT 2019, 337-345, 2019 | 25* | 2019 |
RAD-MMM: Multilingual multiaccented multispeaker text to speech R Badlani, R Valle, KJ Shih, JF Santos, S Gururani, B Catanzaro Proc. Interspeech 2023, 626-630, 2023 | 13* | 2023 |
Generating and using joint representations of source code R Badlani, O Lewis, G Evangelopoulos, O Hatalsky, B Ni US Patent 11,169,786, 2021 | 11 | 2021 |
Framework for evaluation of sound event detection in web videos R Badlani, A Shah, B Elizalde, A Kumar, B Raj 2018 IEEE International Conference on Acoustics, Speech and Signal …, 2018 | 8 | 2018 |
Improving robustness of LLM-based speech synthesis by learning monotonic alignment P Neekhara, S Hussain, S Ghosh, J Li, R Valle, R Badlani, B Ginsburg arXiv preprint arXiv:2406.17957, 2024 | 6 | 2024 |
Relation extraction with contextualized relation embedding (CRE) X Chen, R Badlani arXiv preprint arXiv:2011.09658, 2020 | 5 | 2020 |
Pattern-based automatic parallelization of representative-based clustering algorithms S Islam, S Balasubramaniam, S Gupta, S Brajesh, R Badlani, ... 2018 IEEE 5th International Conference on Data Science and Advanced …, 2018 | 5 | 2018 |
Synthesizing video from audio using one or more neural networks MY Liu, K Nagano, S Yeongho, JRVG da Costa, SEO Jaewoo, TC Wang, ... US Patent App. 17/382,027, 2023 | 4 | 2023 |
VANI: Very-lightweight Accent-controllable TTS for Native and Non-native speakers with Identity Preservation R Badlani, A Arora, S Ghosh, R Valle, KJ Shih, JF Santos, B Ginsburg, ... ICASSP 2023-2023 IEEE International Conference on Acoustics, Speech and …, 2023 | 3 | 2023 |
Automatic parallelization of representative-based clustering algorithms for multicore cluster systems S Islam, S Balasubramaniam, S Gupta, S Brajesh, R Badlani, ... International Journal of Data Science and Analytics 10, 135-159, 2020 | 3 | 2020 |
Generative modeling for low dimensional speech attributes with neural spline flows KJ Shih, R Valle, R Badlani, JF Santos, B Catanzaro arXiv preprint arXiv:2203.01786, 2022 | 2 | 2022 |
DCASE challenge task 1 A Kumar, B Elizalde, A Shah, R Badlani, E Vincent, B Raj, I Lane Tech. Rep., DCASE2016 Challenge, 2016 | 2 | 2016 |