Google Scholar

About 1,177 results (0.03 sec)

Frozen in time: A joint video and image encoder for end-to-end retrieval

[PDF] arxiv.org

Mm-llms: Recent advances in multimodal large language models

D Zhang, Y Yu, J Dong, C Li, D Su, C Chu… - ar** an end-to-end chat-centric video
understanding system, coined as VideoChat. It integrates video foundation models and …

Cited by 590 · All 4 versions

[PDF] arxiv.org

Videomamba: State space model for efficient video understanding

K Li, X Li, Y Wang, Y He, Y Wang, L Wang… - European Conference on …, 2024 - Springer

Addressing the dual challenges of local redundancy and global dependencies in video
understanding, this work innovatively adapts the Mamba to the video domain. The proposed …

Cited by 150 · All 2 versions
