Social media data for conservation science: A methodological overview

T Toivonen, V Heikinheimo, C Fink, A Hausmann… - Biological …, 2019 - Elsevier
Improved understanding of human-nature interactions is crucial to conservation science and
practice, but collecting relevant data remains challenging. Recently, social media have …

A comprehensive survey of scene graphs: Generation and application

X Chang, P Ren, P Xu, Z Li, X Chen… - IEEE Transactions on …, 2021 - ieeexplore.ieee.org
Scene graph is a structured representation of a scene that can clearly express the objects,
attributes, and relationships between objects in the scene. As computer vision technology …

A metaverse: Taxonomy, components, applications, and open challenges

SM Park, YG Kim - IEEE Access, 2022 - ieeexplore.ieee.org
Unlike previous studies on the Metaverse based on Second Life, the current Metaverse is
based on the social value of Generation Z that online and offline selves are not different …

Semantic communications: Principles and challenges

Z Qin, X Tao, J Lu, W Tong, GY Li - arXiv preprint arXiv:2201.01389, 2021 - arxiv.org
Semantic communication, regarded as the breakthrough beyond the Shannon paradigm,
aims at the successful transmission of semantic information conveyed by the source rather …

MERLOT Reserve: Neural script knowledge through vision and language and sound

R Zellers, J Lu, X Lu, Y Yu, Y Zhao… - Proceedings of the …, 2022 - openaccess.thecvf.com
As humans, we navigate a multimodal world, building a holistic understanding from all our
senses. We introduce MERLOT Reserve, a model that represents videos jointly over time …

The All-Seeing Project V2: Towards general relation comprehension of the open world

W Wang, Y Ren, H Luo, T Li, C Yan, Z Chen… - … on Computer Vision, 2024 - Springer
We present the All-Seeing Project V2: a new model and dataset designed for
understanding object relations in images. Specifically, we propose the All-Seeing Model V2 …

CPT: Colorful prompt tuning for pre-trained vision-language models

Y Yao, A Zhang, Z Zhang, Z Liu, TS Chua, M Sun - AI Open, 2024 - Elsevier
Vision-Language Pre-training (VLP) models have shown promising capabilities in
grounding natural language in image data, facilitating a broad range of cross-modal tasks …

Causal intervention for weakly-supervised semantic segmentation

D Zhang, H Zhang, J Tang… - Advances in Neural …, 2020 - proceedings.neurips.cc
We present a causal inference framework to improve Weakly-Supervised Semantic
Segmentation (WSSS). Specifically, we aim to generate better pixel-level pseudo-masks by …

Imagine that! abstract-to-intricate text-to-image synthesis with scene graph hallucination diffusion

S Wu, H Fei, H Zhang, TS Chua - Advances in Neural …, 2024 - proceedings.neurips.cc
In this work, we investigate the task of text-to-image (T2I) synthesis under the
abstract-to-intricate setting, i.e., generating intricate visual content from simple abstract text prompts …

Unbiased scene graph generation from biased training

K Tang, Y Niu, J Huang, J Shi… - Proceedings of the …, 2020 - openaccess.thecvf.com
Today's scene graph generation (SGG) task is still far from practical, mainly due to the
severe training bias, e.g., collapsing diverse "human walk on/sit on/lay on beach" into "human …