- Academic Search

mGTE: Generalized long-context text representation and reranking models for multilingual text retrieval

X Zhang, Y Zhang, D Long, W Xie, Z Dai, J Tang… - arXiv preprint arXiv …, 2024 - arxiv.org

We present systematic efforts in building long-context multilingual text representation model
(TRM) and reranker from scratch for text retrieval. We first introduce a text encoder (base …

Cited by 13. All 4 versions.

Normalization and effective learning rates in reinforcement learning

C Lyle, Z Zheng, K Khetarpal, J Martens… - arXiv preprint arXiv …, 2024 - arxiv.org

Normalization layers have recently experienced a renaissance in the deep reinforcement
learning and continual learning literature, with several works highlighting diverse benefits …

Cited by 3. All 2 versions.

On the overlooked structure of stochastic gradients

Z Xie, QY Tang, M Sun, P Li - Advances in Neural …, 2023 - proceedings.neurips.cc

Stochastic gradients closely relate to both optimization and generalization of deep neural
networks (DNNs). Some works attempted to explain the success of stochastic optimization …

Cited by 5. All 6 versions.

Neural networks with (low-precision) polynomial approximations: New insights and techniques for accuracy improvement

C Zhang, J Fan, MH Au, SM Yiu - arXiv preprint arXiv:2402.11224, 2024 - arxiv.org

Replacing non-polynomial functions (e.g., non-linear activation functions such as ReLU) in a
neural network with their polynomial approximations is a standard practice in privacy …

Cited by 1. All 2 versions.

DARE the Extreme: Revisiting Delta-Parameter Pruning For Fine-Tuned Models

W Deng, Y Zhao, V Vakilian, M Chen, X Li… - arXiv preprint arXiv …, 2024 - arxiv.org

Storing open-source fine-tuned models separately introduces redundancy and increases
response times in applications utilizing multiple models. Delta-parameter pruning (DPP) …

ConoDL: a deep learning framework for rapid generation and prediction of conotoxins

M Guo, Z Li, X Deng, D Luo, J Yang, Y Chen… - Journal of Computer …, 2025 - Springer

Conotoxins, being small disulfide-rich and bioactive peptides, manifest notable
pharmacological potential and find extensive applications. However, the exploration of …

All 6 versions.

Neural Field Classifiers via Target Encoding and Classification Loss

X Yang, Z Xie, X Zhou, B Liu, B Liu, Y Liu… - arXiv preprint arXiv …, 2024 - arxiv.org

Neural field methods have seen great progress in various long-standing tasks in computer
vision and computer graphics, including novel view synthesis and geometry reconstruction …

All 3 versions.

Weight decay induces low-rank attention layers

S Kobayashi, Y Akram, J Von Oswald - arXiv preprint arXiv:2410.23819, 2024 - arxiv.org

The effect of regularizers such as weight decay when training deep neural networks is not
well understood. We study the influence of weight decay as well as $L_2$-regularization …

All 3 versions.

Avoiding Catastrophic Forgetting Via Neuronal Decay

RO Malashin, MA Mikhalkova - 2024 Wave Electronics and its …, 2024 - ieeexplore.ieee.org

In continual learning settings, a neural network is taught different tasks sequentially and the
network is prone to catastrophic forgetting. We investigate the role of regularization methods …

Cited by 1.

On the overlooked pitfalls of weight decay and how to mitigate them: A gradient-norm perspective
