Foundations & trends in multimodal machine learning: Principles, challenges, and open questions

PP Liang, A Zadeh, LP Morency - ACM Computing Surveys, 2024 - dl.acm.org
Multimodal machine learning is a vibrant multi-disciplinary research field that aims to design
computer agents with intelligent capabilities such as understanding, reasoning, and learning …

Multimodal biomedical AI

JN Acosta, GJ Falcone, P Rajpurkar, EJ Topol - Nature Medicine, 2022 - nature.com
The increasing availability of biomedical data from large biobanks, electronic health records,
medical imaging, wearable and ambient biosensors, and the lower cost of genome and …

Foundation models in robotics: Applications, challenges, and the future

R Firoozi, J Tucker, S Tian… - … Journal of Robotics …, 2023 - journals.sagepub.com
We survey applications of pretrained foundation models in robotics. Traditional deep
learning models in robotics are trained on small datasets tailored for specific tasks, which …

Re-thinking data strategy and integration for artificial intelligence: concepts, opportunities, and challenges

A Aldoseri, KN Al-Khalifa, AM Hamouda - Applied Sciences, 2023 - mdpi.com
The use of artificial intelligence (AI) is becoming more prevalent across industries such as
healthcare, finance, and transportation. Artificial intelligence is based on the analysis of …

Multimodal learning with transformers: A survey

P Xu, X Zhu, DA Clifton - IEEE Transactions on Pattern Analysis …, 2023 - ieeexplore.ieee.org
Transformer is a promising neural network learner, and has achieved great success in
various machine learning tasks. Thanks to the recent prevalence of multimodal applications …

Artificial intelligence for multimodal data integration in oncology

J Lipkova, RJ Chen, B Chen, MY Lu, M Barbieri… - Cancer Cell, 2022 - cell.com
In oncology, the patient state is characterized by a whole spectrum of modalities, ranging
from radiology, histology, and genomics to electronic health records. Current artificial …

A complete survey on generative AI (AIGC): Is ChatGPT from GPT-4 to GPT-5 all you need?

C Zhang, C Zhang, S Zheng, Y Qiao, C Li… - arXiv preprint arXiv …, 2023 - arxiv.org
As ChatGPT goes viral, generative AI (AIGC, aka AI-generated content) has made headlines
everywhere because of its ability to analyze and create text, images, and beyond. With such …

Federated learning for predicting histological response to neoadjuvant chemotherapy in triple-negative breast cancer

J Ogier du Terrail, A Leopold, C Joly, C Béguier… - Nature Medicine, 2023 - nature.com
Triple-negative breast cancer (TNBC) is a rare cancer, characterized by high metastatic
potential and poor prognosis, and has limited treatment options. The current standard of …

Versatile diffusion: Text, images and variations all in one diffusion model

X Xu, Z Wang, G Zhang, K Wang… - Proceedings of the …, 2023 - openaccess.thecvf.com
Recent advances in diffusion models have set an impressive milestone in many generation
tasks, and trending works such as DALL-E2, Imagen, and Stable Diffusion have attracted …

Deep neural networks and tabular data: A survey

V Borisov, T Leemann, K Seßler, J Haug… - IEEE Transactions on …, 2022 - ieeexplore.ieee.org
Heterogeneous tabular data are the most commonly used form of data and are essential for
numerous critical and computationally demanding applications. On homogeneous datasets …