- Academic Search

Common sense reasoning for deepfake detection

Y Zhang, B Colman, X Guo, A Shahriyari… - European Conference on …, 2024 - Springer

State-of-the-art deepfake detection approaches rely on image-based features extracted via
neural networks. While these approaches trained in a supervised manner extract likely fake …

Cited by 16

Vision-and-language navigation today and tomorrow: A survey in the era of foundation models

Y Zhang, Z Ma, J Li, Y Qiao, Z Wang, J Chai… - ar…

Bootstrapping Language-Guided Navigation Learning with Self-Refining Data Flywheel

Z Wang, J Li, Y Hong, S Li, K Li, S Yu, Y Wang… - arxiv preprint arxiv …, 2024 - arxiv.org

Creating high-quality data for training robust language-instructed agents is a long-lasting
challenge in embodied AI. In this paper, we introduce a Self-Refining Data Flywheel (SRDF) …

To Ask or Not to Ask? Detecting Absence of Information in Vision and Language Navigation

SS Abraham, S Garg, F Dayoub - arxiv preprint arxiv:2411.05831, 2024 - arxiv.org

Recent research in Vision Language Navigation (VLN) has overlooked the development of
agents' inquisitive abilities, which allow them to ask clarifying questions when instructions …

Spatial and Temporal Language Understanding: Representation, Reasoning, and Grounding

P Kordjamshidi, Q Ning, J Pustejovsky… - Proceedings of the …, 2024 - aclanthology.org

This tutorial provides an overview of the cutting edge research on spatial and temporal
language understanding. We also cover some essential background material from various …

NavHint: Vision and language navigation agent with a hint generator

Common sense reasoning for deepfake detection
