Language modeling with gated convolutional networks

YN Dauphin, A Fan, M Auli… - … conference on machine …, 2017 - proceedings.mlr.press
The pre-dominant approach to language modeling to date is based on recurrent neural
networks. Their success on this task is often linked to their ability to capture unbounded …

[BOOK] Neural network methods in natural language processing

Y Goldberg - 2017 - books.google.com
Neural networks are a family of powerful machine learning models and this book focuses on
their application to natural language data. The first half of the book (Parts I and II) covers the …

FastText.zip: Compressing text classification models

A Joulin, E Grave, P Bojanowski, M Douze… - arXiv preprint arXiv …, 2016 - arxiv.org
We consider the problem of producing compact architectures for text classification, such that
the full model fits in a limited amount of memory. After considering different solutions …

Towards energy-efficient deep learning: An overview of energy-efficient approaches along the deep learning lifecycle

V Mehlin, S Schacht, C Lanquillon - arXiv preprint arXiv:2303.01980, 2023 - arxiv.org
Deep Learning has enabled many advances in machine learning applications in the last few
years. However, since current Deep Learning algorithms require much energy for …

Pre-training tasks for embedding-based large-scale retrieval

WC Chang, FX Yu, YW Chang, Y Yang… - arXiv preprint arXiv …, 2020 - arxiv.org
We consider the large-scale query-document retrieval problem: given a query (e.g., a
question), return the set of relevant documents (e.g., paragraphs containing the answer) from …

[PDF] Jurassic-1: Technical details and evaluation

O Lieber, O Sharir, B Lenz, Y Shoham - White Paper. AI21 Labs, 2021 - sharir.org
Jurassic-1 is a pair of auto-regressive language models recently released by AI21 Labs,
consisting of J1-Jumbo, a 178B-parameter model, and J1-Large, a 7B-parameter model. We …

An introduction to neural information retrieval

B Mitra, N Craswell - Foundations and Trends® in Information …, 2018 - nowpublishers.com
Neural ranking models for information retrieval (IR) use shallow or deep neural networks to
rank search results in response to a query. Traditional learning to rank models employ …

Nonparametric masked language modeling

S Min, W Shi, M Lewis, X Chen, W Yih… - arXiv preprint arXiv …, 2022 - arxiv.org
Existing language models (LMs) predict tokens with a softmax over a finite vocabulary,
which can make it difficult to predict rare tokens or phrases. We introduce NPM, the first …

Learning visual features from large weakly supervised data

A Joulin, L van der Maaten, A Jabri… - Computer Vision–ECCV …, 2016 - Springer
Convolutional networks trained on large supervised datasets produce visual features which
form the basis for the state-of-the-art in many computer-vision problems. Further …

Exploring sparsity in recurrent neural networks

S Narang, E Elsen, G Diamos, S Sengupta - arXiv preprint arXiv …, 2017 - arxiv.org
Recurrent Neural Networks (RNNs) are widely used to solve a variety of problems and as the
quantity of data and the amount of available compute have increased, so have model sizes …