Advancing transformer architecture in long-context large language models: A comprehensive survey
With the explosion of interest ignited by ChatGPT, Transformer-based Large Language Models (LLMs)
have paved a revolutionary path toward Artificial General Intelligence (AGI) and have been …
RWKV: Reinventing RNNs for the transformer era
Transformers have revolutionized almost all natural language processing (NLP) tasks but
suffer from memory and computational complexity that scales quadratically with sequence …
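The quadratic cost referenced above comes from materializing an n-by-n score matrix in standard self-attention. The following minimal NumPy sketch is purely illustrative (it is not taken from the RWKV paper), and all function and variable names are our own.

```python
# Minimal sketch of why standard self-attention scales quadratically:
# the score matrix alone holds n*n entries.
import numpy as np

def naive_attention(q, k, v):
    """q, k, v: arrays of shape (n, d). Memory is dominated by the (n, n) score matrix."""
    scores = q @ k.T / np.sqrt(q.shape[-1])          # (n, n): quadratic in sequence length
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)   # softmax over keys
    return weights @ v                                # (n, d) output

n, d = 1024, 64
rng = np.random.default_rng(0)
q, k, v = rng.standard_normal((n, d)), rng.standard_normal((n, d)), rng.standard_normal((n, d))
out = naive_attention(q, k, v)
print(out.shape)  # (1024, 64); the intermediate score matrix held 1024*1024 floats
```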
SpAtten: Efficient sparse attention architecture with cascade token and head pruning
The attention mechanism is becoming increasingly popular in Natural Language Processing
(NLP) applications, showing superior performance to convolutional and recurrent …
Enable deep learning on mobile devices: Methods, systems, and applications
Deep neural networks (DNNs) have achieved unprecedented success in the field of artificial
intelligence (AI), including computer vision, natural language processing, and speech …
RecNMP: Accelerating personalized recommendation with near-memory processing
Personalized recommendation systems leverage deep learning models and account for the
majority of data center AI cycles. Their performance is dominated by memory-bound sparse …
ELSA: Hardware-software co-design for efficient, lightweight self-attention mechanism in neural networks
The self-attention mechanism is rapidly emerging as one of the most important primitives
in neural networks (NNs) for its ability to identify the relations within input entities. The self …
TensorDIMM: A practical near-memory processing architecture for embeddings and tensor operations in deep learning
Recent studies from several hyperscalers point to embedding layers as the most memory-
intensive deep learning (DL) algorithm being deployed in today's datacenters. This paper …
Self-attention does not need $O(n^2)$ memory
MN Rabe, C Staats - arXiv preprint arXiv:2112.05682, 2021 - arxiv.org
We present a very simple algorithm for attention that requires $O(1)$ memory with respect
to sequence length and an extension to self-attention that requires $O(\log n)$ memory …
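The $O(1)$-memory result rests on evaluating the softmax incrementally over chunks of keys and values instead of materializing the full score matrix. Below is a hedged NumPy sketch of that running-max/running-sum idea for a single query; it follows the spirit of arXiv:2112.05682 but is not the authors' implementation, and every name in it is our own.

```python
# Sketch of chunked attention with an online softmax (assumed illustration,
# not the authors' code). Per query, the working set is a few length-d
# vectors and scalars, independent of sequence length n.
import numpy as np

def chunked_attention_single_query(q, k, v, chunk=128):
    """q: (d,), k and v: (n, d). Processes keys/values chunk by chunk."""
    d = q.shape[-1]
    m = -np.inf            # running max of scores (numerical stability)
    denom = 0.0            # running sum of exp(score - m)
    acc = np.zeros(d)      # running weighted sum of values, scaled consistently with m
    for start in range(0, k.shape[0], chunk):
        s = k[start:start + chunk] @ q / np.sqrt(d)   # scores for this chunk
        m_new = max(m, s.max())
        scale = np.exp(m - m_new)                     # rescale old accumulators
        p = np.exp(s - m_new)
        acc = acc * scale + p @ v[start:start + chunk]
        denom = denom * scale + p.sum()
        m = m_new
    return acc / denom

n, d = 1024, 64
rng = np.random.default_rng(0)
q, k, v = rng.standard_normal(d), rng.standard_normal((n, d)), rng.standard_normal((n, d))
# Agrees with the fully materialized softmax up to floating-point error.
s = k @ q / np.sqrt(d)
ref = (np.exp(s - s.max()) / np.exp(s - s.max()).sum()) @ v
print(np.allclose(chunked_attention_single_query(q, k, v), ref))
```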
Beyond efficiency: A systematic survey of resource-efficient large language models
The burgeoning field of Large Language Models (LLMs), exemplified by sophisticated
models like OpenAI's ChatGPT, represents a significant advancement in artificial …
Recent advances in neural text generation: A task-agnostic survey
In recent years, considerable research has been dedicated to the application of neural
models in the field of natural language generation (NLG). The primary objective is to …