Advancing transformer architecture in long-context large language models: A comprehensive survey
Transformer-based Large Language Models (LLMs) have been applied in diverse areas
such as knowledge bases, human interfaces, and dynamic agents, marking a stride …
Gemini: a family of highly capable multimodal models
G Team, R Anil, S Borgeaud, JB Alayrac, J Yu… - arXiv preprint arXiv …, 2023 - arxiv.org
This report introduces a new family of multimodal models, Gemini, that exhibit remarkable
capabilities across image, audio, video, and text understanding. The Gemini family consists …
Llama 2: Open foundation and fine-tuned chat models
In this work, we develop and release Llama 2, a collection of pretrained and fine-tuned large
language models (LLMs) ranging in scale from 7 billion to 70 billion parameters. Our fine …
Large language models for software engineering: Survey and open problems
This paper provides a survey of the emerging area of Large Language Models (LLMs) for
Software Engineering (SE). It also sets out open research challenges for the application of …
Retentive network: A successor to transformer for large language models
In this work, we propose Retentive Network (RetNet) as a foundation architecture for large
language models, simultaneously achieving training parallelism, low-cost inference, and …
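For orientation only, here is a minimal sketch of a decay-masked parallel form of retention in the spirit of the RetNet paper, not the authors' implementation: it is a single-head simplification that omits the paper's xPos-style rotation of queries and keys, and the shapes and decay value are illustrative assumptions.

    import torch

    def retention_parallel(Q, K, V, gamma=0.96875):
        # Q, K, V: (batch, seq_len, dim); gamma is an assumed per-head decay in (0, 1).
        seq_len = Q.shape[1]
        idx = torch.arange(seq_len)
        diff = idx[:, None] - idx[None, :]          # n - m for every position pair
        causal = (diff >= 0).float()
        # D[n, m] = gamma^(n - m) for n >= m, else 0: scores decay with distance
        # instead of being softmax-normalised as in standard attention.
        D = (gamma ** diff.clamp(min=0).float()) * causal
        scores = Q @ K.transpose(-1, -2) / Q.shape[-1] ** 0.5
        return (scores * D) @ V

    # Illustrative usage with random tensors.
    Q, K, V = (torch.randn(1, 16, 64) for _ in range(3))
    out = retention_parallel(Q, K, V)   # (1, 16, 64)

The same computation also admits an equivalent recurrent form, which is what the paper leverages for low-cost inference, while the parallel form above is what allows training parallelism.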
Extending context window of large language models via positional interpolation
We present Position Interpolation (PI) that extends the context window sizes of RoPE-based
pretrained LLMs such as LLaMA models to up to 32768 with minimal fine-tuning (within …
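As a rough illustration of the Position Interpolation idea rather than the authors' code, the sketch below rescales RoPE position indices so that a longer sequence is mapped back into the position range seen during pretraining; the 2048-token pretraining window, the 8192-token target, and the head dimension of 128 are assumed values.

    import torch

    def rope_angles(positions, head_dim, base=10000.0):
        # Standard RoPE rotation angles for each (position, frequency-pair) index.
        inv_freq = 1.0 / (base ** (torch.arange(0, head_dim, 2).float() / head_dim))
        return positions[:, None] * inv_freq[None, :]

    def interpolated_positions(seq_len, train_ctx, target_ctx):
        # Position Interpolation: shrink position indices by train_ctx / target_ctx
        # so a target_ctx-long sequence stays inside the pretraining position range.
        return torch.arange(seq_len).float() * (train_ctx / target_ctx)

    # Illustrative values: a window of 2048 tokens at pretraining, extended to 8192.
    positions = interpolated_positions(seq_len=8192, train_ctx=2048, target_ctx=8192)
    angles = rope_angles(positions, head_dim=128)   # feeds the usual RoPE rotation

The interpolated angles would then drive the usual rotation of queries and keys; the brief fine-tuning stage mentioned in the abstract is outside the scope of this sketch.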
Flashattention-3: Fast and accurate attention with asynchrony and low-precision
Attention, as a core layer of the ubiquitous Transformer architecture, is the bottleneck for
large language models and long-context applications. FlashAttention elaborated an approach to speed up …
Longbench: A bilingual, multitask benchmark for long context understanding
Although large language models (LLMs) demonstrate impressive performance for many
language tasks, most of them can only handle texts a few thousand tokens long, limiting their …
Diagonal state spaces are as effective as structured state spaces
Modeling long range dependencies in sequential data is a fundamental step towards
attaining human-level performance in many modalities such as text, vision, audio and video …
Same task, more tokens: the impact of input length on the reasoning performance of large language models
This paper explores the impact of extending input lengths on the capabilities of Large
Language Models (LLMs). Despite LLMs' advancements in recent times, their performance …