On statistical rates and provably efficient criteria of latent diffusion transformers (DiTs)
We investigate the statistical and computational limits of latent Diffusion Transformers (DiTs)
under the low-dimensional linear latent space assumption. Statistically, we study the …
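For reference, the low-dimensional linear latent space assumption in this line of work is typically stated as follows; the snippet does not give the exact form, so this is the standard statement rather than the paper's own. Each data point $x \in \mathbb{R}^{D}$ satisfies $x = U z$ for a latent $z \in \mathbb{R}^{d_0}$ and a matrix $U \in \mathbb{R}^{D \times d_0}$ with orthonormal columns, where $d_0 \ll D$, so that statistical rates can depend on the intrinsic dimension $d_0$ rather than the ambient dimension $D$.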
Outlier-efficient Hopfield layers for large transformer-based models
We introduce an Outlier-Efficient Modern Hopfield Model (termed $\mathrm{OutEffHop}$)
and use it to address the outlier inefficiency problem of training gigantic transformer-based …
Tensor attention training: Provably efficient learning of higher-order transformers
Tensor Attention, a multi-view attention that is able to capture high-order correlations among
multiple modalities, can overcome the representational limitations of classical matrix …
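As a point of reference, one common way to write a third-order (tensor) attention is shown below; this is an illustrative formulation, not necessarily the exact definition used in the paper. For a query $q_i$ and two sets of keys and values $\{k^{(1)}_j, v^{(1)}_j\}$ and $\{k^{(2)}_k, v^{(2)}_k\}$ (for example, from two modalities),
$$\mathrm{TAttn}_i = \sum_{j,k} \frac{\exp\!\big(q_i^{\top}(k^{(1)}_j \odot k^{(2)}_k)\big)}{\sum_{j',k'} \exp\!\big(q_i^{\top}(k^{(1)}_{j'} \odot k^{(2)}_{k'})\big)}\,\big(v^{(1)}_j \odot v^{(2)}_k\big),$$
where $\odot$ is the elementwise product. The score $q_i^{\top}(k^{(1)}_j \odot k^{(2)}_k)$ is a trilinear form, which is what lets the mechanism capture correlations among three views at once; evaluated naively it costs on the order of $n^3$ in the sequence length $n$, which is why provably efficient training criteria matter.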
Uniform memory retrieval with larger capacity for modern Hopfield models
We propose a two-stage memory retrieval dynamics for modern Hopfield models, termed
$\mathtt{U\text{-}Hop}$, with enhanced memory capacity. Our key contribution is a …
On computational limits of modern Hopfield models: A fine-grained complexity analysis
We investigate the computational limits of the memory retrieval dynamics of modern Hopfield
models through fine-grained complexity analysis. Our key contribution is the …
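For context, the memory retrieval dynamics referred to here is, in the modern Hopfield literature, the softmax-based update; the snippet does not restate it, so the following is the standard form rather than anything specific to this paper:
$$x^{\mathrm{new}} = \Xi\,\mathrm{softmax}\!\big(\beta\,\Xi^{\top} x\big),$$
where $\Xi \in \mathbb{R}^{d \times M}$ stacks the $M$ stored memory patterns as columns, $x \in \mathbb{R}^{d}$ is the query pattern, and $\beta > 0$ is an inverse temperature. Exact evaluation takes $O(dM)$ time per query, and a fine-grained complexity analysis asks when retrieval over many queries can provably be computed or approximated faster.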
Multi-layer transformers gradient can be approximated in almost linear time
The computational complexity of the self-attention mechanism in popular transformer
architectures poses significant challenges for training and inference, and becomes the …
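The bottleneck in question is standard softmax self-attention; writing it for a single head in the usual notation (nothing here is specific to this paper):
$$\mathrm{Attn}(X) = \mathrm{softmax}\!\left(\frac{X W_Q (X W_K)^{\top}}{\sqrt{d}}\right) X W_V, \qquad X \in \mathbb{R}^{n \times d}.$$
The $n \times n$ attention matrix makes both the forward pass and the naive gradient computation cost $\Theta(n^2 d)$ per layer in sequence length $n$; "almost linear time" in the title refers to driving this toward roughly $n^{1+o(1)}$ under suitable assumptions.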
HSR-enhanced sparse attention acceleration
Large Language Models (LLMs) have demonstrated remarkable capabilities across various
applications, but their performance on long-context tasks is often limited by the …
Out-of-distribution generalization via composition: a lens through induction heads in transformers
Large language models (LLMs) such as GPT-4 sometimes appear to be creative, solving
novel tasks often with a few demonstrations in the prompt. These tasks require the models to …
The closeness of in-context learning and weight shifting for softmax regression
Large language models (LLMs) are known for their exceptional performance in natural
language processing, making them highly effective in many human life-related or even job …
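For reference, softmax regression in this line of work is commonly formulated as follows; the statement is generic and the paper's precise setup may differ. Given $A \in \mathbb{R}^{n \times d}$ and a target $b \in \mathbb{R}^{n}$, solve
$$\min_{x \in \mathbb{R}^{d}} \left\| \frac{\exp(Ax)}{\langle \exp(Ax), \mathbf{1}_n \rangle} - b \right\|_2^{2},$$
that is, fit a softmax-normalized linear model to $b$; the closeness in the title refers, on this reading, to how the model induced by in-context demonstrations compares with the one obtained by a gradient-style shift of the weights in such a regression.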