A comprehensive survey on applications of transformers for deep learning tasks

S Islam, H Elmekki, A Elsebai, J Bentahar… - Expert Systems with …, 2024 - Elsevier
Abstract: Transformers are Deep Neural Networks (DNNs) that utilize a self-attention
mechanism to capture contextual relationships within sequential data. Unlike traditional …

Deep Learning applications for COVID-19

C Shorten, TM Khoshgoftaar, B Furht - Journal of Big Data, 2021 - Springer
This survey explores how Deep Learning has battled the COVID-19 pandemic and provides
directions for future research on COVID-19. We cover Deep Learning applications in Natural …

Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context

Gemini Team, P Georgiev, VI Lei, R Burnell, L Bai… - arXiv preprint arXiv …, 2024 - arxiv.org
In this report, we introduce the Gemini 1.5 family of models, representing the next generation
of highly compute-efficient multimodal models capable of recalling and reasoning over fine …

Faith and fate: Limits of transformers on compositionality

N Dziri, X Lu, M Sclar, XL Li, L Jiang… - Advances in …, 2024 - proceedings.neurips.cc
Transformer large language models (LLMs) have sparked admiration for their exceptional
performance on tasks that demand intricate multi-step reasoning. Yet, these models …

Holistic evaluation of language models

P Liang, R Bommasani, T Lee, D Tsipras… - arXiv preprint arXiv …, 2022 - arxiv.org
Language models (LMs) are becoming the foundation for almost all major language
technologies, but their capabilities, limitations, and risks are not well understood. We present …

Large language models can be easily distracted by irrelevant context

F Shi, X Chen, K Misra, N Scales… - International …, 2023 - proceedings.mlr.press
Large language models have achieved impressive performance on various natural
language processing tasks. However, so far they have been evaluated primarily on …

Evaluating large language models in generating synthetic hci research data: a case study

P Hämäläinen, M Tavast, A Kunnari - … of the 2023 CHI Conference on …, 2023 - dl.acm.org
Collecting data is one of the bottlenecks of Human-Computer Interaction (HCI) research.
Motivated by this, we explore the potential of large language models (LLMs) in generating …

Chain-of-thought prompting elicits reasoning in large language models

J Wei, X Wang, D Schuurmans… - Advances in neural …, 2022 - proceedings.neurips.cc
We explore how generating a chain of thought---a series of intermediate reasoning steps---
significantly improves the ability of large language models to perform complex reasoning. In …

Selection-inference: Exploiting large language models for interpretable logical reasoning

A Creswell, M Shanahan, I Higgins - arXiv preprint arXiv:2205.09712, 2022 - arxiv.org
Large language models (LLMs) have been shown to be capable of impressive few-shot
generalisation to new tasks. However, they still tend to perform poorly on multi-step logical …

Reasoning with language model prompting: A survey

S Qiao, Y Ou, N Zhang, X Chen, Y Yao, S Deng… - arXiv preprint arXiv …, 2022 - arxiv.org
Reasoning, as an essential ability for complex problem-solving, can provide back-end
support for various real-world applications, such as medical diagnosis, negotiation, etc. This …