Scientific discovery in the age of artificial intelligence

H Wang, T Fu, Y Du, W Gao, K Huang, Z Liu… - Nature, 2023 - nature.com
Artificial intelligence (AI) is being increasingly integrated into scientific discovery to augment
and accelerate research, helping scientists to generate hypotheses, design experiments …

Challenges and applications of large language models

J Kaddour, J Harris, M Mozes, H Bradley… - arXiv preprint arXiv …, 2023 - arxiv.org
Large Language Models (LLMs) went from non-existent to ubiquitous in the machine
learning discourse within a few years. Due to the fast pace of the field, it is difficult to identify …

Harnessing the power of LLMs in practice: A survey on ChatGPT and beyond

J Yang, H Jin, R Tang, X Han, Q Feng, H Jiang… - ACM Transactions on …, 2024 - dl.acm.org
This article presents a comprehensive and practical guide for practitioners and end-users
working with Large Language Models (LLMs) in their downstream Natural Language …

Faith and fate: Limits of transformers on compositionality

N Dziri, X Lu, M Sclar, XL Li, L Jiang… - Advances in …, 2024 - proceedings.neurips.cc
Transformer large language models (LLMs) have sparked admiration for their exceptional
performance on tasks that demand intricate multi-step reasoning. Yet, these models …

End-to-end autonomous driving: Challenges and frontiers

L Chen, P Wu, K Chitta, B Jaeger… - IEEE Transactions on …, 2024 - ieeexplore.ieee.org
The autonomous driving community has witnessed a rapid growth in approaches that
embrace an end-to-end algorithm framework, utilizing raw sensor input to generate vehicle …

Cambrian-1: A fully open, vision-centric exploration of multimodal LLMs

S Tong, E Brown, P Wu, S Woo, M Middepogu… - arXiv preprint arXiv …, 2024 - arxiv.org
We introduce Cambrian-1, a family of multimodal LLMs (MLLMs) designed with a vision-
centric approach. While stronger language models can enhance multimodal capabilities, the …

Robust speech recognition via large-scale weak supervision

A Radford, JW Kim, T Xu, G Brockman… - International …, 2023 - proceedings.mlr.press
We study the capabilities of speech processing systems trained simply to predict large
amounts of transcripts of audio on the internet. When scaled to 680,000 hours of multilingual …

Foundations & trends in multimodal machine learning: Principles, challenges, and open questions

PP Liang, A Zadeh, LP Morency - ACM Computing Surveys, 2024 - dl.acm.org
Multimodal machine learning is a vibrant multi-disciplinary research field that aims to design
computer agents with intelligent capabilities such as understanding, reasoning, and learning …

When and why vision-language models behave like bags-of-words, and what to do about it?

M Yuksekgonul, F Bianchi, P Kalluri, D Jurafsky… - arXiv preprint arXiv …, 2022 - arxiv.org
Despite the success of large vision and language models (VLMs) in many downstream
applications, it is unclear how well they encode compositional information. Here, we create …

Self-supervised learning in medicine and healthcare

R Krishnan, P Rajpurkar, EJ Topol - Nature Biomedical Engineering, 2022 - nature.com
The development of medical applications of machine learning has required manual
annotation of data, often by medical experts. Yet, the availability of large-scale unannotated …