Towards energy-efficient deep learning: An overview of energy-efficient approaches along the deep learning lifecycle

V Mehlin, S Schacht, C Lanquillon - arXiv preprint arXiv:2303.01980, 2023 - arxiv.org
Deep Learning has enabled many advances in machine learning applications in the last few
years. However, since current Deep Learning algorithms require much energy for …

ImageNet-21K pretraining for the masses

T Ridnik, E Ben-Baruch, A Noy… - arXiv preprint arXiv …, 2021 - arxiv.org
ImageNet-1K serves as the primary dataset for pretraining deep learning models for
computer vision tasks. The ImageNet-21K dataset, which is bigger and more diverse, is used …

Beyond one-hot encoding: Lower dimensional target embedding

P Rodríguez, MA Bautista, J Gonzalez… - Image and Vision Computing, 2018 - Elsevier
Target encoding plays a central role when learning Convolutional Neural Networks. In this
realm, one-hot encoding is the most prevalent strategy due to its simplicity. However, this so …
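
A minimal sketch of the general idea, not the authors' exact method: replace one-hot targets with fixed low-dimensional class codes and decode predictions by nearest code. The sizes and random Gaussian codes below are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n_classes, embed_dim = 10_000, 64      # embed_dim << n_classes

# Each class gets a fixed random low-dimensional code instead of a
# one-hot row; random unit vectors are near-orthogonal in high dimension.
codes = rng.standard_normal((n_classes, embed_dim))
codes /= np.linalg.norm(codes, axis=1, keepdims=True)

def targets_for(labels):
    """Low-dimensional regression targets replacing one-hot vectors."""
    return codes[labels]                       # (batch, embed_dim)

def predict(outputs):
    """Decode network outputs to class ids via the nearest class code."""
    return np.argmax(outputs @ codes.T, axis=1)

labels = np.array([3, 42, 7, 999])
assert (predict(targets_for(labels)) == labels).all()
```

The output layer then needs only embed_dim units rather than n_classes, which is where the memory and compute savings come from.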

A survey on green deep learning

J Xu, W Zhou, Z Fu, H Zhou, L Li - arXiv preprint arXiv:2111.05193, 2021 - arxiv.org
In recent years, larger and deeper models are springing up and continuously pushing state-
of-the-art (SOTA) results across various fields like natural language processing (NLP) and …

Large memory layers with product keys

G Lample, A Sablayrolles… - Advances in Neural Information Processing Systems, 2019 - proceedings.neurips.cc
This paper introduces a structured memory which can be easily integrated into a neural
network. The memory is very large by design and significantly increases the capacity of the …
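
A minimal NumPy sketch of the product-key idea: split the query in two, search each sub-key set independently, and score only the Cartesian product of the two top-k sets. All sizes below are toy values, not the paper's configuration.

```python
import numpy as np

rng = np.random.default_rng(0)
d, half = 16, 8                 # query dim; each sub-key sees one half
n_sub = 32                      # sub-keys per half -> n_sub**2 = 1024 slots
topk = 4

sub_keys1 = rng.standard_normal((n_sub, half))
sub_keys2 = rng.standard_normal((n_sub, half))
values = rng.standard_normal((n_sub * n_sub, 64))   # one value per slot

def pkm_read(q):
    """Read from a product-key memory: 2*n_sub sub-key comparisons
    index n_sub**2 slots, instead of scoring every full key."""
    s1 = sub_keys1 @ q[:half]            # scores against first half
    s2 = sub_keys2 @ q[half:]            # scores against second half
    i1 = np.argsort(s1)[-topk:]          # top-k sub-keys per half
    i2 = np.argsort(s2)[-topk:]
    # Candidate full keys = Cartesian product of the two top-k sets.
    cand_scores = s1[i1][:, None] + s2[i2][None, :]    # (topk, topk)
    cand_ids = i1[:, None] * n_sub + i2[None, :]       # flat slot indices
    flat = np.argsort(cand_scores.ravel())[-topk:]
    ids, w = cand_ids.ravel()[flat], cand_scores.ravel()[flat]
    w = np.exp(w - w.max()); w /= w.sum()              # softmax weights
    return w @ values[ids]                             # weighted value sum

out = pkm_read(rng.standard_normal(d))
print(out.shape)    # (64,)
```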

Strategies for training large vocabulary neural language models

W Chen, D Grangier, M Auli - arXiv preprint arXiv:1512.04906, 2015 - arxiv.org
Training neural network language models over large vocabularies is still computationally
very costly compared to count-based models such as Kneser-Ney. At the same time, neural …
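
One strategy in this family, adaptive softmax, ships with PyTorch as nn.AdaptiveLogSoftmaxWithLoss. The sketch below is not the paper's own implementation; the cutoffs are arbitrary illustrative values and assume a frequency-sorted vocabulary.

```python
import torch
import torch.nn as nn

vocab_size, hidden = 100_000, 512

# Adaptive softmax splits a frequency-sorted vocabulary into a full-size
# head and progressively smaller tail clusters, so frequent words get
# most of the compute and rare words get cheap low-rank projections.
criterion = nn.AdaptiveLogSoftmaxWithLoss(
    in_features=hidden,
    n_classes=vocab_size,
    cutoffs=[2_000, 20_000],   # head: ids 0-1999; tails: 2000-19999, 20000+
)

hidden_states = torch.randn(32, hidden)          # a batch of LM states
targets = torch.randint(0, vocab_size, (32,))    # frequency-sorted word ids
out = criterion(hidden_states, targets)
print(out.loss)   # mean negative log-likelihood over the batch
```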

A no-regret generalization of hierarchical softmax to extreme multi-label classification

M Wydmuch, K Jasinska, M Kuznetsov… - Advances in Neural Information Processing Systems, 2018 - proceedings.neurips.cc
Extreme multi-label classification (XMLC) is a problem of tagging an instance with a small
subset of relevant labels chosen from an extremely large pool of possible labels. Large label …
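
A minimal sketch of the hierarchical-softmax idea the paper generalizes, shown for the plain multiclass case: one sigmoid classifier per internal tree node, with a label's probability given by the product of branch probabilities along its root-to-leaf path. The paper's probabilistic label trees extend this to the multi-label setting; the toy tree below is an assumption for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n_labels, d = 8, 16              # leaves of a complete binary label tree
depth = int(np.log2(n_labels))

# One sigmoid classifier per internal node; predicting a label costs
# O(log n_labels) node evaluations instead of O(n_labels).
node_w = rng.standard_normal((n_labels - 1, d)) * 0.1

def label_prob(x, label):
    """P(label | x) in a heap-indexed complete binary tree."""
    node, prob = 0, 1.0
    for bit in format(label, f"0{depth}b"):      # MSB-first path bits
        p_right = 1.0 / (1.0 + np.exp(-node_w[node] @ x))
        prob *= p_right if bit == "1" else 1.0 - p_right
        node = 2 * node + (2 if bit == "1" else 1)
    return prob

x = rng.standard_normal(d)
print(sum(label_prob(x, y) for y in range(n_labels)))  # ~1.0 by construction
```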

Hierarchical memory networks

S Chandar, S Ahn, H Larochelle, P Vincent… - arXiv preprint arXiv …, 2016 - arxiv.org
Memory networks are neural networks with an explicit memory component that can be both
read from and written to by the network. The memory is often addressed in a soft way using a …
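
A minimal sketch of the soft addressing the snippet refers to: a softmax over query-slot similarities. Every slot contributes to the read, which is what makes full soft attention expensive for large memories and motivates hierarchical schemes that attend only over a retrieved subset (e.g., via maximum inner product search).

```python
import numpy as np

rng = np.random.default_rng(0)
n_slots, d = 1_000, 32
memory = rng.standard_normal((n_slots, d))   # slot contents

def soft_read(query):
    """Soft memory read: softmax-weighted average over all slots."""
    scores = memory @ query                  # (n_slots,)
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()
    return weights @ memory                  # convex combination of slots

out = soft_read(rng.standard_normal(d))
print(out.shape)   # (32,)
```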

Sampled softmax with random Fourier features

AS Rawat, J Chen, FXX Yu… - Advances in Neural Information Processing Systems, 2019 - proceedings.neurips.cc
The computational cost of training with softmax cross entropy loss grows linearly with the
number of classes. For the settings where a large number of classes are involved, a …
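
A hedged sketch of sampled softmax with a uniform proposal, for orientation only: the paper's contribution is to replace the uniform q below with a proposal built from random Fourier features that tracks the softmax distribution, which reduces the bias of the estimate.

```python
import numpy as np

rng = np.random.default_rng(0)
n_classes, d, n_sampled = 50_000, 32, 128

class_emb = rng.standard_normal((n_classes, d)) / np.sqrt(d)

def sampled_softmax_loss(h, target):
    """Sampled-softmax cross entropy: score the target plus a small set
    of sampled negatives, correcting each logit by log q(class) so the
    estimate stays consistent with the full softmax. Here q is uniform;
    a real implementation would also remove accidental hits of the target
    among the negatives."""
    neg = rng.choice(n_classes, size=n_sampled, replace=False)
    ids = np.concatenate(([target], neg))
    logq = np.full(ids.shape, -np.log(n_classes))   # uniform proposal
    logits = class_emb[ids] @ h - logq              # importance correction
    logits -= logits.max()
    return -logits[0] + np.log(np.exp(logits).sum())

h = rng.standard_normal(d)
print(sampled_softmax_loss(h, target=123))
```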

Efficient training of retrieval models using negative cache

E Lindgren, S Reddi, R Guo… - Advances in Neural Information Processing Systems, 2021 - proceedings.neurips.cc
Factorized models, such as two tower neural network models, are widely used for scoring
(query, document) pairs in information retrieval tasks. These models are typically trained by …
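
A simplified sketch of the caching idea, not the paper's exact algorithm (which additionally corrects for the staleness of cached embeddings): keep a FIFO cache of document embeddings from earlier batches and reuse them as extra negatives, so many negatives are available without re-encoding documents each step.

```python
import numpy as np
from collections import deque

rng = np.random.default_rng(0)
d, cache_size = 32, 512
cache = deque(maxlen=cache_size)   # stale document embeddings as negatives

def step_loss(q_emb, d_emb):
    """Softmax loss over (query, document) pairs: each query's positive
    is its paired document; negatives come from the embedding cache."""
    negs = np.stack(cache) if cache else np.empty((0, d))
    pos = (q_emb * d_emb).sum(-1, keepdims=True)        # (B, 1)
    neg = q_emb @ negs.T                                # (B, |cache|)
    logits = np.concatenate([pos, neg], axis=1)
    logits -= logits.max(axis=1, keepdims=True)
    loss = (-logits[:, 0] + np.log(np.exp(logits).sum(axis=1))).mean()
    cache.extend(d_emb)      # enqueue this batch's document embeddings
    return loss

q = rng.standard_normal((16, d)); doc = rng.standard_normal((16, d))
for _ in range(3):
    print(step_loss(q, doc))
```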