Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context G Team, P Georgiev, VI Lei, R Burnell, L Bai, A Gulati, G Tanzer, ... arXiv preprint arXiv:2403.05530, 2024 | 1170 | 2024 |
Correcting time-continuous emotional labels by modeling the reaction lag of evaluators S Mariooryad, C Busso IEEE Transactions on Affective Computing 6 (2), 97-108, 2014 | 137 | 2014 |
Location-relative attention mechanisms for robust long-form speech synthesis E Battenberg, RJ Skerry-Ryan, S Mariooryad, D Stanton, D Kao, ... ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and …, 2020 | 133 | 2020 |
Wave-Tacotron: Spectrogram-free end-to-end text-to-speech synthesis RJ Weiss, RJ Skerry-Ryan, E Battenberg, S Mariooryad, DP Kingma ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and …, 2021 | 129 | 2021 |
Analysis and compensation of the reaction lag of evaluators in continuous emotional annotations S Mariooryad, C Busso 2013 Humaine Association Conference on Affective Computing and Intelligent …, 2013 | 83 | 2013 |
Exploring cross-modality affective reactions for audiovisual emotion recognition S Mariooryad, C Busso IEEE Transactions on Affective Computing 4 (2), 183-196, 2013 | 83 | 2013 |
Compensating for speaker or lexical variabilities in speech for emotion recognition S Mariooryad, C Busso Speech Communication 57, 1-12, 2014 | 80 | 2014 |
Iterative feature normalization scheme for automatic emotion detection from speech C Busso, S Mariooryad, A Metallinou, S Narayanan IEEE Transactions on Affective Computing 4 (4), 386-397, 2013 | 79 | 2013 |
Generating human-like behaviors using joint, speech-driven models for conversational agents S Mariooryad, C Busso IEEE Transactions on Audio, Speech, and Language Processing 20 (8), 2329-2340, 2012 | 69 | 2012 |
Building a naturalistic emotional speech corpus by retrieving expressive behaviors from existing speech corpora S Mariooryad, R Lotfian, C Busso Interspeech, 238-242, 2014 | 67 | 2014 |
Semi-supervised generative modeling for controllable speech synthesis R Habib, S Mariooryad, M Shannon, E Battenberg, RJ Skerry-Ryan, ... arXiv preprint arXiv:1910.01709, 2019 | 60 | 2019 |
Effective use of variational embedding capacity in expressive end-to-end speech synthesis E Battenberg, S Mariooryad, D Stanton, RJ Skerry-Ryan, M Shannon, ... arXiv preprint arXiv:1906.03402, 2019 | 58 | 2019 |
Speaker generation D Stanton, M Shannon, S Mariooryad, RJ Skerry-Ryan, E Battenberg, ... ICASSP 2022 - 2022 IEEE International Conference on Acoustics, Speech and …, 2022 | 36 | 2022 |
Audiovisual corpus to analyze whisper speech T Tran, S Mariooryad, C Busso 2013 IEEE International Conference on Acoustics, Speech and Signal …, 2013 | 35 | 2013 |
Facial expression recognition in the presence of speech using blind lexical compensation S Mariooryad, C Busso IEEE Transactions on Affective Computing 7 (4), 346-359, 2015 | 33 | 2015 |
Touchless user interface navigation using gestures RL Carceroni, PR Sanketi, S Shah, D Ozkan, S Mariooryad, SMS Tarzjani, ... US Patent 9,804,679, 2017 | 32 | 2017 |
Spoken question answering and speech continuation using spectrogram-powered LLM E Nachmani, A Levkovitch, R Hirsch, J Salazar, C Asawaroengchai, ... arXiv preprint arXiv:2305.15255, 2023 | 31 | 2023 |
The cost of dichotomizing continuous labels for binary classification problems: Deriving a Bayesian-optimal classifier S Mariooryad, C Busso IEEE Transactions on Affective Computing 8 (1), 119-130, 2015 | 29 | 2015 |
Automatic characterization of speaking styles in educational videos S Mariooryad, A Kannan, D Hakkani-Tür, E Shriberg 2014 IEEE International Conference on Acoustics, Speech and Signal …, 2014 | 23 | 2014 |
Feature and model level compensation of lexical content for facial emotion recognition S Mariooryad, C Busso 2013 10th IEEE International Conference and Workshops on Automatic Face and …, 2013 | 23 | 2013 |