- Academic Search

Joint learning for relationship and interaction analysis in video with multimodal feature fusion

RETRACTED ARTICLE: ICDN: integrating consistency and difference networks by transformer for multimodal sentiment analysis

Q Zhang, L Shi, P Liu, Z Zhu, L Xu - Applied Intelligence, 2023 - Springer

The sentiment of human language is usually reflected through multimodal forms such as
natural language, facial expression, and voice intonation. However, the previous research …

Cited by 39 · All 3 versions

TSPNet: Translation supervised prototype network via residual learning for multimodal social relation extraction

H Kang, X Li, L Jin, C Liu, Z Zhang, S Li, Y Zhang - Neurocomputing, 2022 - Elsevier

Multimodal social relation extraction requires sufficient features fusion to identify the relation
between different targets. Compared with traditional multimodal social relation extraction …

Cited by 8 · All 2 versions

[PDF] researchgate.net

Two stage multi-modal modeling for video interaction analysis in deep video understanding challenge

S Sun, X Xiong, Y Zheng - Proceedings of the 30th ACM International …, 2022 - dl.acm.org

Interaction understanding between different entities in human-centered movie video is
receiving more and more attention. Recently, a deep video understanding (DVU) task is …

Cited by 6 · All 2 versions

[PDF] archive.org

Shifted GCN-GAT and Cumulative-Transformer based Social Relation Recognition for Long Videos

H Wang, Y Hu, Y Zhu, J Qi, B Wu - Proceedings of the 31st ACM …, 2023 - dl.acm.org

Social Relation Recognition is an important part of Video Understanding, providing insights
into the information that videos convey. Most previous works mainly focused on graph …

Cited by 2 · All 2 versions

Multimodal analysis for deep video understanding with video language transformer

B Zhang, Y Fang, T Ren, G Wu - Proceedings of the 30th ACM …, 2022 - dl.acm.org

The Deep Video Understanding Challenge (DVUC) is aimed to use multiple modality
information to build high-level understanding of video, involving tasks such as relationship …

Cited by 5 · All 2 versions

Multimodal early fusion operators for temporal video scene segmentation tasks

AAR Beserra, R Goularte - Multimedia Tools and Applications, 2023 - Springer

The Temporal Video Scene Segmentation (TVSS) task is still an open problem
presenting challenges in the Multimedia Analysis area. Current approaches employ …

Cited by 5 · All 4 versions

[PDF] ssrn.com

MT-TCCT: Multi-task learning for multimodal emotion recognition

Y Wang, Z Chen, S Chen, Y Zhu - International Conference on Artificial …, 2022 - Springer

Multimodal emotion recognition is an emerging research field, which aims to capture
affective information from multimodal data, such as natural language, facial expression, and …

Cited by 8 · All 3 versions

Hybrid improvements in multimodal analysis for deep video understanding

B Zhang, F Yu, Y Fang, T Ren, G Wu - Proceedings of the 3rd ACM …, 2021 - dl.acm.org

The Deep Video Understanding Challenge (DVU) is a task that focuses on comprehending
long duration videos which involve many entities. Its main goal is to build relationship and …

Cited by 3

[PDF] acm.org

A Multi-Stream Approach for Video Understanding

L Kunam, L Rossetto, A Bernstein - Proceedings of the 30th ACM …, 2022 - dl.acm.org

The automatic annotation of higher-level semantic information in long-form video content is
still a challenging task. The Deep Video Understanding (DVU) Challenge aims at catalyzing …

All 5 versions

[PDF] aclanthology.org

Development of a MultiModal Annotation Framework and Dataset for Deep Video Understanding

E Loc, K Curtis, G Awad, S Rajput… - Proceedings of the 2nd …, 2022 - aclanthology.org

In this paper we introduce our approach and methods for collecting and annotating a new
dataset for deep video understanding. The proposed dataset is composed of 3 seasons (15 …

Cited by 2 · All 6 versions
