Google Scholar

Scene adaptive sparse transformer for event-based object detection

Y Peng, H Li, Y Zhang, X Sun… - Proceedings of the IEEE …, 2024 - openaccess.thecvf.com

While recent Transformer-based approaches have shown impressive performances on
event-based object detection tasks, their high computational costs still diminish the low …

Cited by 13

Cdac: Cross-domain attention consistency in transformer for domain adaptive semantic segmentation

K Wang, D Kim, R Feris… - Proceedings of the IEEE …, 2023 - openaccess.thecvf.com

While transformers have greatly boosted performance in semantic segmentation, domain
adaptive transformers are not yet well explored. We identify that the domain gap can cause …

Cited by 13

Lightweight convolutional neural networks with context broadcast transformer for real-time semantic segmentation

K Hu, Z Xie, Q Hu - Image and Vision Computing, 2024 - Elsevier

With the increasing application of embedded mobile devices in various fields, lightweight
real-time semantic segmentation systems have attracted more and more attention. Many …

Cited by 2

Worldafford: Affordance grounding based on natural language instructions

C Chen, Y Cong, Z Kan - 2024 IEEE 36th International …, 2024 - ieeexplore.ieee.org

Affordance grounding aims to localize the interaction regions for the manipulated objects in
the scene image according to given instructions, which is essential for Embodied AI and …

Cited by 4

Unitnorm: Rethinking normalization for transformers in time series

N Huang, C Kümmerle, X Zhang - arXiv preprint arXiv:2405.15903, 2024 - arxiv.org

Normalization techniques are crucial for enhancing Transformer models' performance and
stability in time series analysis tasks, yet traditional methods like batch and layer …

Cited by 2

A cross-modal collaborative guiding network for sarcasm explanation in multi-modal multi-party dialogues

X Zhuang, Z Li, C Zhang, H Ma - Engineering Applications of Artificial …, 2025 - Elsevier

Indirect forms of language, such as sarcasm, are highly prevalent in contemporary human
daily communication. While the indirect nature of metaphorical language ensures that …

When multi-view meets multi-level: A novel spatio-temporal transformer for traffic prediction

J Lin, Q Ren, X Lv, H Xu, Y Liu - Information Fusion, 2025 - Elsevier

Traffic prediction is a vital aspect of Intelligent Transportation Systems with widespread
applications. The main challenge is accurately modeling the complex spatial and temporal …

Dual-encoder network for pavement concrete crack segmentation with multi-stage supervision

J Wang, H Yao, J Hu, Y Ma, J Wang - Automation in Construction, 2025 - Elsevier

Cracks are a prevalent disease on pavement concrete materials. Timely assessment and
repair of concrete materials can significantly extend their service life. However, accurate …

Bring Adaptive Binding Prototypes to Generalized Referring Expression Segmentation

W Li, Z Zhao, H Bai, F Su - arXiv preprint arXiv:2405.15169, 2024 - arxiv.org

Referring Expression Segmentation (RES) has attracted rising attention, aiming to identify
and segment objects based on natural language expressions. While substantial progress …

Cited by 1

A Multi-Modal Unified Representation Learning Framework with Masked Image Modeling for Remote Sensing Images

D Du, T Liu, Y Gu - IEEE Transactions on Geoscience and …, 2024 - ieeexplore.ieee.org

The coordinated utilization of diverse types of satellite sensors provides a more
comprehensive view of the Earth's surface. However, due to the significant heterogeneity …


Scratching visual transformer's back with uniform attention

Scene adaptive sparse transformer for event-based object detection
