- Academic Search

Triplet knowledge distillation

Vlap: Efficient video-language alignment via frame prompting and distilling for video question answering

X Wang, J Liang, CK Wang, K Deng, Y Lou, MC Lin… - CoRR, 2023 - openreview.net

In this work, we propose an efficient Video-Language Alignment (ViLA) network. Our ViLA
model addresses both efficient frame sampling and effective cross-modal alignment in a …

Cited by 5 · All 2 versions


[PDF] amazon.science

Vila: Efficient video-language alignment for video question answering

X Wang, J Liang, CK Wang, K Deng, Y Lou… - … on Computer Vision, 2024 - Springer

We propose an efficient Video-Language Alignment (ViLA) network. Our ViLA model
addresses both efficient frame sampling and effective cross-modal alignment in a unified …

Cited by 3 · All 6 versions


[PDF] mlr.press

Auxiliary modality learning with generalized curriculum distillation

Y Shen, X Wang, P Gao, M Lin - International Conference on …, 2023 - proceedings.mlr.press

Driven by the need from real-world applications, Auxiliary Modality Learning (AML) offers the
possibility to utilize more information from auxiliary data in training, while only requiring data …

Cited by 2 · All 4 versions


[PDF] proquest.com

Learning-Based Autonomous Driving With Enhanced Data Efficiency and Policy Training

Y Shen - 2023 - search.proquest.com

Autonomous vehicles are capable of sensing the environment and moving around with little
to no human intervention, enhancing efficiency and safety. Self-driving cars, for instance, will …

