EgoSchema: A diagnostic benchmark for very long-form video language understanding K Mangalam, R Akshulakov, J Malik Advances in Neural Information Processing Systems 36, 2024 | 164 | 2024 |
Do Vision and Language Encoders Represent the World Similarly? M Maniparambil, R Akshulakov, YAD Djilali, M El Amine Seddik, ... Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2024 | 6 | 2024 |
From Unimodal to Multimodal: Scaling up Projectors to Align Modalities M Maniparambil, R Akshulakov, YAD Djilali, S Narayan, A Singh, ... arXiv preprint arXiv:2409.19425, 2024 | | 2024 |
Do Vision and Language Encoders Represent the World Similarly? R Akshulakov | | 2024 |
Package analysis devices and systems M Kashi, K Mahmood, K Dornadula, SP Segu, R Akshulakov US Patent 11,900,312, 2024 | | 2024 |