A systematic survey and critical review on evaluating large language models: Challenges, limitations, and recommendations
Large Language Models (LLMs) have recently gained significant attention due to
their remarkable capabilities in performing diverse tasks across various domains. However …
The butterfly effect of model editing: Few edits can trigger large language models collapse
Although model editing has shown promise in revising knowledge in Large Language
Models (LLMs), its impact on the inherent capabilities of LLMs is often overlooked. In this …
Base of RoPE bounds context length
Position embedding is a core component of current Large Language Models (LLMs). Rotary
position embedding (RoPE), a technique that encodes the position information with a …
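The snippet breaks off before saying how RoPE encodes position; as a reference point (the standard RoPE formulation from the literature, not text from this entry), each 2-D slice of a query or key at position m is rotated by a frequency derived from the base b named in the title:

f(x, m)_{[2i, 2i+1]} = \begin{pmatrix} \cos m\theta_i & -\sin m\theta_i \\ \sin m\theta_i & \cos m\theta_i \end{pmatrix} \begin{pmatrix} x_{2i} \\ x_{2i+1} \end{pmatrix}, \qquad \theta_i = b^{-2i/d}

where d is the head dimension and b is commonly 10000; the paper's title asserts that this base bounds the context length the model can effectively use.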
When Precision Meets Position: BFloat16 Breaks Down RoPE in Long-Context Training
Extending context window sizes allows large language models (LLMs) to process longer
sequences and handle more complex tasks. Rotary Positional Embedding (RoPE) has …
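A minimal sketch of the precision failure the title points at (assuming PyTorch; this code is illustrative, not from the paper): bfloat16 carries only 8 significand bits, so integer position indices above 256 stop being exactly representable, and distinct positions collapse onto the same RoPE rotation angle.

import torch

# bfloat16 has an 8-bit significand: integers above 256 round away.
positions = torch.tensor([255.0, 256.0, 257.0, 1000.0, 1001.0])
print(positions.to(torch.bfloat16).tolist())
# -> [255.0, 256.0, 256.0, 1000.0, 1000.0]; 257 and 1001 collapse

# Positions that collide in bf16 produce identical angles m * theta,
# so RoPE can no longer distinguish them in long-context training.
theta = 10000.0 ** -1  # an illustrative low RoPE frequency, base 10000
angles = positions.to(torch.bfloat16).float() * theta
print(angles.tolist())

Once two positions share a bf16 representation, every downstream RoPE rotation for that pair is identical, which is one concrete way long-context training can degrade.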
What is Wrong with Perplexity for Long-context Language Modeling?
Handling long-context inputs is crucial for large language models (LLMs) in tasks such as
extended conversations, document summarization, and many-shot in-context learning …
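For orientation, the perplexity the title interrogates is the usual length-normalized metric (a textbook definition, not taken from this snippet):

\mathrm{PPL}(x_{1:N}) = \exp\Big(-\frac{1}{N} \sum_{t=1}^{N} \log p_\theta(x_t \mid x_{<t})\Big)

Because the average weights every token equally, the few tokens that actually require long-range context contribute almost nothing to the score on long inputs, which is plausibly the mismatch the title refers to.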
LLMs as Collaborator: Demands-Guided Collaborative Retrieval-Augmented Generation for Commonsense Knowledge-Grounded Open-Domain Dialogue Systems
J Yu, S Wu, J Chen, W Zhou - Findings of the Association for …, 2024 - aclanthology.org
Capturing the unique knowledge demands for each dialogue context plays a crucial role in
commonsense knowledge-grounded response generation. However, current CoT-based …
TALEC: Teach your LLM to evaluate in specific domain with in-house criteria by criteria division and zero-shot plus few-shot
K Zhang, S Yuan, H Zhao - arXiv preprint arXiv:2407.10999, 2024 - arxiv.org
With the rapid development of large language models (LLM), the evaluation of LLM
becomes increasingly important. Measuring text generation tasks such as summarization …
Forgetting curve: A reliable method for evaluating memorization capability for long-context models
X Liu, R Zhao, P Huang, C Xiao, B Li, J Wang… - arXiv preprint arXiv …, 2024 - arxiv.org
Numerous recent works aim to extend the effective context length of language models, and
various methods, tasks and benchmarks exist to measure a model's effective memorization …
GASP: Efficient Black-Box Generation of Adversarial Suffixes for Jailbreaking LLMs
AR Basani, X Zhang - arXiv preprint arXiv:2411.14133, 2024 - arxiv.org
Large Language Models (LLMs) have shown impressive proficiency across a range of
natural language processing tasks yet remain vulnerable to adversarial prompts, known as …