On efficient training of large-scale deep learning models: A literature review

L Shen, Y Sun, Z Yu, L Ding, X Tian, D Tao - arXiv preprint arXiv …, 2023 - arxiv.org
The field of deep learning has witnessed significant progress, particularly in computer vision
(CV), natural language processing (NLP), and speech. The use of large-scale models …

LeanAgent: Lifelong Learning for Formal Theorem Proving

A Kumarappan, M Tiwari, P Song, RJ George… - arXiv preprint arXiv …, 2024 - arxiv.org
Large Language Models (LLMs) have been successful in mathematical reasoning tasks
such as formal theorem proving when integrated with interactive proof assistants like Lean …

Robust and resource-efficient table-based fact verification through multi-aspect adversarial contrastive learning

R Liu, Y Zhang, B Yang, Q Shi, L Tian - Information Processing & …, 2024 - Elsevier
Table-based fact verification focuses on determining the truthfulness of statements by cross-referencing data in tables. This task is challenging due to the complex interactions inherent …

Fisher information-based efficient curriculum federated learning with large language models

J Liu, J Ren, R **, Z Zhang, Y Zhou… - arXiv preprint arXiv …, 2024 - arxiv.org
As a promising paradigm to collaboratively train models with decentralized data, Federated
Learning (FL) can be exploited to fine-tune Large Language Models (LLMs). While LLMs …

Communication interchange for artificial intelligence systems

RC Voicu, AK Pande, MH Tanveer… - 2024 International …, 2024 - ieeexplore.ieee.org
The rise and proliferation of Artificial Intelligence (AI) technologies are bringing
transformative changes to various sectors, signaling a new era of innovation in fields as …

QUART: Latency-Aware FaaS System for Pipelining Large Model Inference

Y Lin, Y Li, S Peng, Y Tang, S Luo… - 2024 IEEE 44th …, 2024 - ieeexplore.ieee.org
Pipeline parallelism is a key mechanism to ensure the performance of large model serving
systems. These systems need to deal with unpredictable online workloads with low latency …

Adaptive granular data compression and interval granulation for efficient classification

K Cai, H Zhang, M Li, D Miao - Information Sciences, 2025 - Elsevier
Efficiency is crucial in deep learning tasks and has garnered significant attention in the field of green deep learning. However, existing methods often sacrifice efficiency for slight …

MOFO: MOtion FOcused Self-Supervision for Video Understanding

M Ahmadian, F Guerin, A Gilbert - arXiv preprint arXiv:2308.12447, 2023 - arxiv.org
Self-supervised learning (SSL) techniques have recently produced outstanding results in
learning visual representations from unlabeled videos. Despite the importance of motion in …

Harmonic Loss Trains Interpretable AI Models

DD Baek, Z Liu, R Tyagi, M Tegmark - arXiv preprint arXiv:2502.01628, 2025 - arxiv.org
In this paper, we introduce harmonic loss as an alternative to the standard cross-entropy loss for training neural networks and large language models (LLMs). Harmonic loss enables …

SeDi-Instruct: Enhancing Alignment of Language Models through Self-Directed Instruction Generation

J Kim, M Kim, S Lee - arXiv preprint arXiv:2502.04774, 2025 - arxiv.org
The rapid evolution of Large Language Models (LLMs) has enabled the industry to develop
various AI-based services. Instruction tuning is considered essential in adapting foundation …