Google Scholar

Scene graph generation: A comprehensive survey

H Li, G Zhu, L Zhang, Y Jiang, Y Dang, H Hou, P Shen… - Neurocomputing, 2024 - Elsevier

Deep learning techniques have led to remarkable breakthroughs in the field of object
detection and have spawned a lot of scene-understanding tasks in recent years. Scene …

Cited by 114 · All 8 versions

[PDF] arxiv.org

Video-of-thought: Step-by-step video reasoning from perception to cognition

H Fei, S Wu, W Ji, H Zhang, M Zhang, ML Lee… - arXiv preprint arXiv …, 2024 - arxiv.org

Existing research of video understanding still struggles to achieve in-depth comprehension
and reasoning in complex videos, primarily due to the under-exploration of two key …

Cited by 71 · All 9 versions

[PDF] arxiv.org

RelTR: Relation transformer for scene graph generation

Y Cong, MY Yang, B Rosenhahn - IEEE Transactions on …, 2023 - ieeexplore.ieee.org

Different objects in the same scene are more or less related to each other, but only a limited
number of these relationships are noteworthy. Inspired by Detection Transformer, which …

Cited by 172 · All 12 versions

[PDF] arxiv.org

Video transformers: A survey

J Selva, AS Johansen, S Escalera… - … on Pattern Analysis …, 2023 - ieeexplore.ieee.org

Transformer models have shown great success handling long-range interactions, making
them a promising tool for modeling video. However, they lack inductive biases and scale …

Cited by 137 · All 10 versions

[PDF] thecvf.com

Text to image generation with semantic-spatial aware GAN

W Liao, K Hu, MY Yang… - Proceedings of the IEEE …, 2022 - openaccess.thecvf.com

Text-to-image synthesis (T2I) aims to generate photo-realistic images which are
semantically consistent with the text descriptions. Existing methods are usually built upon …

Cited by 155 · All 7 versions

[PDF] acm.org

Constructing holistic spatio-temporal scene graph for video semantic role labeling

Y Zhao, H Fei, Y Cao, B Li, M Zhang, J Wei… - Proceedings of the 31st …, 2023 - dl.acm.org

As one of the core video semantic understanding tasks, Video Semantic Role Labeling
(VidSRL) aims to detect the salient events from given videos, by recognizing the predict …

Cited by 46 · All 5 versions

[PDF] neurips.cc

Delving into sequential patches for deepfake detection

J Guan, H Zhou, Z Hong, E Ding… - Advances in …, 2022 - proceedings.neurips.cc

Recent advances in face forgery techniques produce nearly visually untraceable deepfake
videos, which could be leveraged with malicious intentions. As a result, researchers have …

Cited by 58 · All 6 versions

[PDF] aaai.org

MASTER: Market-guided stock transformer for stock price forecasting

T Li, Z Liu, Y Shen, X Wang, H Chen… - Proceedings of the AAAI …, 2024 - ojs.aaai.org

Stock price forecasting has remained an extremely challenging problem for many decades
due to the high volatility of the stock market. Recent efforts have been devoted to modeling …

Cited by 24 · All 4 versions

[PDF] thecvf.com

SportsHHI: A dataset for human-human interaction detection in sports videos

T Wu, R He, G Wu, L Wang - … of the IEEE/CVF conference on …, 2024 - openaccess.thecvf.com

Video-based visual relation detection tasks such as video scene graph generation play
important roles in fine-grained video understanding. However current video visual relation …

Cited by 7 · All 6 versions

Region-focused multi-view transformer-based generative adversarial network for cardiac cine MRI reconstruction

J Lyu, G Li, C Wang, C Qin, S Wang, Q Dou, J Qin - Medical Image Analysis, 2023 - Elsevier

Cardiac cine magnetic resonance imaging (MRI) reconstruction is challenging due to spatial
and temporal resolution trade-offs. Temporal correlation in cardiac cine MRI is informative …

Cited by 39 · All 3 versions

Spatial-temporal transformer for dynamic scene graph generation

[HTML] Scene graph generation: A comprehensive survey
