Bowen Shi
Facebook AI Research
Verified email at meta.com
Title
Cited by
Year
Learning audio-visual speech representation by masked multimodal cluster prediction
B Shi, WN Hsu, K Lakhotia, A Mohamed
arXiv preprint arXiv:2201.02184, 2022
Cited by: 332 · Year: 2022
Scaling speech technology to 1,000+ languages
V Pratap, A Tjandra, B Shi, P Tomasello, A Babu, S Kundu, A Elkahky, ...
Journal of Machine Learning Research 25 (97), 1-52, 2024
Cited by: 298 · Year: 2024
Voicebox: Text-guided multilingual universal speech generation at scale
M Le, A Vyas, B Shi, B Karrer, L Sari, R Moritz, M Williamson, V Manohar, ...
Advances in neural information processing systems 36, 2024
Cited by: 255 · Year: 2024
Scaling autoregressive multi-modal models: Pretraining and instruction tuning
L Yu, B Shi, R Pasunuru, B Muller, O Golovneva, T Wang, A Babu, B Tang, ...
arXiv preprint arXiv:2309.02591, 2023
Cited by: 137 · Year: 2023
Robust self-supervised audio-visual speech recognition
B Shi, WN Hsu, A Mohamed
arXiv preprint arXiv:2201.01763, 2022
Cited by: 126 · Year: 2022
Comparative layer-wise analysis of self-supervised speech models
A Pasad, B Shi, K Livescu
ICASSP 2023-2023 IEEE International Conference on Acoustics, Speech and …, 2023
Cited by: 115 · Year: 2023
American Sign Language fingerspelling recognition in the wild
B Shi, AM Del Rio, J Keane, J Michaux, D Brentari, G Shakhnarovich, ...
2018 IEEE Spoken Language Technology Workshop (SLT), 145-152, 2018
Cited by: 96 · Year: 2018
Audiobox: Unified audio generation with natural language prompts
A Vyas, B Shi, M Le, A Tjandra, YC Wu, B Guo, J Zhang, X Zhang, ...
arXiv preprint arXiv:2312.15821, 2023
Cited by: 91 · Year: 2023
Offloading guidelines for augmented reality applications on wearable devices
B Shi, J Yang, Z Huang, P Hui
Proceedings of the 23rd ACM international conference on Multimedia, 1271-1274, 2015
Cited by: 89 · Year: 2015
Fingerspelling recognition in the wild with iterative visual attention
B Shi, AMD Rio, J Keane, D Brentari, G Shakhnarovich, K Livescu
Proceedings of the IEEE/CVF International Conference on Computer Vision …, 2019
Cited by: 88 · Year: 2019
Few-shot acoustic event detection via meta learning
B Shi, M Sun, KC Puvvada, CC Kao, S Matsoukas, C Wang
ICASSP 2020-2020 IEEE International Conference on Acoustics, Speech and …, 2020
Cited by: 76 · Year: 2020
Open-domain sign language translation learned from online video
B Shi, D Brentari, G Shakhnarovich, K Livescu
arXiv preprint arXiv:2205.12870, 2022
Cited by: 54 · Year: 2022
Expresso: A benchmark and analysis of discrete expressive speech resynthesis
TA Nguyen, WN Hsu, A d'Avirro, B Shi, I Gat, M Fazel-Zarani, T Remez, ...
arXiv preprint arXiv:2308.05725, 2023
Cited by: 51 · Year: 2023
A cross-task analysis of text span representations
S Toshniwal, H Shi, B Shi, L Gao, K Livescu, K Gimpel
arXiv preprint arXiv:2006.03866, 2020
Cited by: 44 · Year: 2020
u-HuBERT: Unified mixed-modal speech pretraining and zero-shot transfer to unlabeled modality
WN Hsu, B Shi
Advances in Neural Information Processing Systems 35, 21157-21170, 2022
Cited by: 43 · Year: 2022
Fingerspelling detection in American Sign Language
B Shi, D Brentari, G Shakhnarovich, K Livescu
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2021
Cited by: 35 · Year: 2021
MuAViC: A multilingual audio-visual corpus for robust speech recognition and robust speech-to-text translation
M Anwar, B Shi, V Goswami, WN Hsu, J Pino, C Wang
arXiv preprint arXiv:2303.00628, 2023
Cited by: 34 · Year: 2023
ReVISE: Self-supervised speech resynthesis with visual input for universal and generalized speech regeneration
WN Hsu, T Remez, B Shi, J Donley, Y Adi
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2023
Cited by: 31* · Year: 2023
Generative pre-training for speech with flow matching
AH Liu, M Le, A Vyas, B Shi, A Tjandra, WN Hsu
arXiv preprint arXiv:2310.16338, 2023
Cited by: 26 · Year: 2023
Semi-supervised acoustic event detection based on tri-training
B Shi, M Sun, CC Kao, V Rozgic, S Matsoukas, C Wang
ICASSP 2019-2019 IEEE International Conference on Acoustics, Speech and …, 2019
Cited by: 26 · Year: 2019
Articles 1–20