Deep neural networks and tabular data: A survey
Heterogeneous tabular data are the most commonly used form of data and are essential for
numerous critical and computationally demanding applications. On homogeneous datasets …
An attentive survey of attention models
The attention model has now become an important concept in neural networks and has been
researched within diverse application domains. This survey provides a structured and …
Surface form competition: Why the highest probability answer isn't always right
Large language models have shown promising results in zero-shot settings (Brown et al.,
2020; Radford et al., 2019). For example, they can perform multiple choice tasks simply by …
Efficient methods for natural language processing: A survey
Recent work in natural language processing (NLP) has yielded appealing results from
scaling model parameters and training data; however, using only scale to improve …
Adversarial sparse transformer for time series forecasting
Many approaches have been proposed for time series forecasting, in light of its significance
in a wide range of applications including business demand prediction. However, the existing methods …
scTab: scaling cross-tissue single-cell annotation models
Identifying cellular identities is a key use case in single-cell transcriptomics. While machine
learning has been leveraged to automate cell annotation predictions for some time, there …
Neural oblivious decision ensembles for deep learning on tabular data
Nowadays, deep neural networks (DNNs) have become the main instrument for machine
learning tasks within a wide range of domains, including vision, NLP, and speech …
A survey on green deep learning
In recent years, larger and deeper models have been springing up, continuously pushing
state-of-the-art (SOTA) results across various fields like natural language processing (NLP) and …
Adaptively sparse transformers
Attention mechanisms have become ubiquitous in NLP. Recent architectures, notably the
Transformer, learn powerful context-aware word representations through layered, multi …
DSelect-k: Differentiable selection in the mixture of experts with applications to multi-task learning
The Mixture-of-Experts (MoE) architecture is showing promising results in improving
parameter sharing in multi-task learning (MTL) and in scaling high-capacity neural networks …