Random feature attention

H Peng, N Pappas, D Yogatama, R Schwartz… - arXiv preprint arXiv …, 2021 - arxiv.org
Transformers are state-of-the-art models for a variety of sequence modeling tasks. At their
core is an attention function which models pairwise interactions between the inputs at every …

A generalizable and accessible approach to machine learning with global satellite imagery

E Rolf, J Proctor, T Carleton, I Bolliger… - Nature …, 2021 - nature.com
Combining satellite imagery with machine learning (SIML) has the potential to address
global challenges by remotely estimating socioeconomic and environmental conditions in …

Implicit kernel learning

CL Li, WC Chang, Y Mroueh, Y Yang… - The 22nd …, 2019 - proceedings.mlr.press
Kernels are powerful and versatile tools in machine learning and statistics. Although the
notion of universal kernels and characteristic kernels has been studied, kernel selection still …

Software and application patterns for explanation methods

M Alber - Explainable AI: interpreting, explaining and visualizing …, 2019 - Springer
Deep neural networks successfully pervaded many application domains and are
increasingly used in critical decision processes. Understanding their workings is desirable …

Uncertainty-aware (UNA) bases for deep Bayesian regression using multi-headed auxiliary networks

S Thakur, C Lorsung, Y Yacoby, F Doshi-Velez… - arXiv preprint arXiv …, 2020 - arxiv.org
Neural Linear Models (NLM) are deep Bayesian models that produce predictive
uncertainties by learning features from the data and then performing Bayesian linear …

Detecting local insights from global labels: supervised and zero-shot sequence labeling via a convolutional decomposition

A Schmaltz - Computational Linguistics, 2021 - direct.mit.edu
We propose a new, more actionable view of neural network interpretability and data analysis
by leveraging the remarkable matching effectiveness of representations derived from deep …

Predicting pairwise relations with neural similarity encoders

F Horn, KR Müller - arXiv preprint arXiv:1702.01824, 2017 - arxiv.org
Matrix factorization is at the heart of many machine learning algorithms, for example,
dimensionality reduction (e.g., kernel PCA) or recommender systems relying on collaborative …

How to iNNvestigate neural networks' predictions!

M Alber, S Lapuschkin, P Seegerer, M Hägele… - 2018 - openreview.net
In recent years, deep neural networks have revolutionized many application domains of
machine learning and are key components of many critical decision or predictive processes …

Understanding uncertainty in Bayesian deep learning

C Lorsung - arXiv preprint arXiv:2106.13055, 2021 - arxiv.org
Neural Linear Models (NLM) are deep Bayesian models that produce predictive uncertainty
by learning features from the data and then performing Bayesian linear regression over …

On Neural Linear Model Prediction, with Applications to Nonstationary Settings

M Guo - 2023 - dash.harvard.edu
Neural Linear Models (NLMs) are deep Bayesian machine learning models that appear in a
variety of contexts due to their data adaptivity and model flexibility, including many settings …