Detecting harmful content on online platforms: what platforms need vs. where research efforts go

A Arora, P Nakov, M Hardalov, SM Sarwar… - ACM Computing …, 2023 - dl.acm.org
The proliferation of harmful content on online platforms is a major societal problem, which
comes in many different forms, including hate speech, offensive language, bullying and …

Multimodal pretraining unmasked: A meta-analysis and a unified framework of vision-and-language BERTs

E Bugliarello, R Cotterell, N Okazaki… - Transactions of the …, 2021 - direct.mit.edu
Large-scale pretraining and task-specific fine-tuning is now the standard methodology for
many tasks in computer vision and natural language processing. Recently, a multitude of …

Dual scene graph convolutional network for motivation prediction

Y Wanyan, X Yang, X Ma, C Xu - ACM Transactions on Multimedia …, 2023 - dl.acm.org
Humans can easily infer the motivations behind human actions from only visual data by
comprehensively analyzing the complex context information and utilizing abundant life …

Achieving Human Parity on Visual Question Answering

M Yan, H Xu, C Li, J Tian, B Bi, W Wang, X Xu… - ACM Transactions on …, 2023 - dl.acm.org
The Visual Question Answering (VQA) task utilizes both visual image and language analysis
to answer a textual question with respect to an image. It has been a popular research topic …

Knowledge-integrated Multi-modal Movie Turning Point Identification

D Wang, R Xu, L Cheng, Z Wang - ACM Transactions on Multimedia …, 2024 - dl.acm.org
The rapid development of artificial intelligence provides rich technologies and tools for the
automated understanding of literary works. As a comprehensive carrier of storylines, movies …