Foundations & trends in multimodal machine learning: Principles, challenges, and open questions
Multimodal machine learning is a vibrant multi-disciplinary research field that aims to design
computer agents with intelligent capabilities such as understanding, reasoning, and learning …
Multimodal biomedical AI
The increasing availability of biomedical data from large biobanks, electronic health records,
medical imaging, wearable and ambient biosensors, and the lower cost of genome and …
Foundation models in robotics: Applications, challenges, and the future
We survey applications of pretrained foundation models in robotics. Traditional deep
learning models in robotics are trained on small datasets tailored for specific tasks, which …
Re-thinking data strategy and integration for artificial intelligence: concepts, opportunities, and challenges
The use of artificial intelligence (AI) is becoming more prevalent across industries such as
healthcare, finance, and transportation. Artificial intelligence is based on the analysis of …
Multimodal learning with transformers: A survey
Transformer is a promising neural network learner, and has achieved great success in
various machine learning tasks. Thanks to the recent prevalence of multimodal applications …
Artificial intelligence for multimodal data integration in oncology
In oncology, the patient state is characterized by a whole spectrum of modalities, ranging
from radiology, histology, and genomics to electronic health records. Current artificial …
A complete survey on generative AI (AIGC): Is ChatGPT from GPT-4 to GPT-5 all you need?
As ChatGPT goes viral, generative AI (AIGC, aka AI-generated content) has made headlines
everywhere because of its ability to analyze and create text, images, and beyond. With such …
Federated learning for predicting histological response to neoadjuvant chemotherapy in triple-negative breast cancer
Triple-negative breast cancer (TNBC) is a rare cancer, characterized by high metastatic
potential and poor prognosis, and has limited treatment options. The current standard of …
Versatile diffusion: Text, images and variations all in one diffusion model
Recent advances in diffusion models have set an impressive milestone in many generation
tasks, and trending works such as DALL-E2, Imagen, and Stable Diffusion have attracted …
Deep neural networks and tabular data: A survey
Heterogeneous tabular data are the most commonly used form of data and are essential for
numerous critical and computationally demanding applications. On homogeneous datasets …