Datasets for large language models: A comprehensive survey

Y Liu, J Cao, C Liu, K Ding, L Jin - arXiv preprint arXiv:2402.18041, 2024 - arxiv.org
This paper explores Large Language Model (LLM) datasets, which play a crucial role in
the remarkable advancements of LLMs. The datasets serve as …

Using natural language processing to support peer‐feedback in the age of artificial intelligence: A cross‐disciplinary framework and a research agenda

E Bauer, M Greisel, I Kuznetsov… - British Journal of …, 2023 - Wiley Online Library
Advancements in artificial intelligence are rapidly increasing. The new‐generation large
language models, such as ChatGPT and GPT‐4, bear the potential to transform educational …

Efficient streaming language models with attention sinks

G Xiao, Y Tian, B Chen, S Han, M Lewis - arXiv preprint arXiv:2309.17453, 2023 - arxiv.org
Deploying Large Language Models (LLMs) in streaming applications such as multi-round
dialogue, where long interactions are expected, is urgently needed but poses two major …

GQA: Training generalized multi-query transformer models from multi-head checkpoints

J Ainslie, J Lee-Thorp, M De Jong… - arXiv preprint arXiv …, 2023 - arxiv.org
Multi-query attention (MQA), which only uses a single key-value head, drastically speeds up
decoder inference. However, MQA can lead to quality degradation, and moreover it may not …
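
A minimal sketch of the idea this abstract describes, assuming PyTorch; the function name, shapes, and weight layout below are illustrative placeholders, not the paper's implementation. Query heads are split into groups that share one key/value head: multi-query attention is the case of a single shared head, and standard multi-head attention is the case of one key/value head per query head.

import torch
import torch.nn.functional as F

def grouped_query_attention(x, wq, wk, wv, n_heads, n_kv_heads):
    # x: (batch, seq, d_model); wq: (d_model, d_model); wk, wv: (d_model, n_kv_heads * head_dim)
    b, t, d = x.shape
    hd = d // n_heads
    q = (x @ wq).view(b, t, n_heads, hd).transpose(1, 2)      # (b, n_heads, t, hd)
    k = (x @ wk).view(b, t, n_kv_heads, hd).transpose(1, 2)   # (b, n_kv_heads, t, hd)
    v = (x @ wv).view(b, t, n_kv_heads, hd).transpose(1, 2)
    # Each group of n_heads // n_kv_heads query heads shares one key/value head.
    k = k.repeat_interleave(n_heads // n_kv_heads, dim=1)     # (b, n_heads, t, hd)
    v = v.repeat_interleave(n_heads // n_kv_heads, dim=1)
    attn = F.softmax(q @ k.transpose(-2, -1) / hd ** 0.5, dim=-1)
    return (attn @ v).transpose(1, 2).reshape(b, t, d)

# MQA is the special case n_kv_heads == 1; multi-head attention is n_kv_heads == n_heads.
d, n_heads, n_kv_heads = 64, 8, 2
x = torch.randn(2, 16, d)
wq = torch.randn(d, d)
wk = torch.randn(d, (d // n_heads) * n_kv_heads)
wv = torch.randn(d, (d // n_heads) * n_kv_heads)
print(grouped_query_attention(x, wq, wk, wv, n_heads, n_kv_heads).shape)  # torch.Size([2, 16, 64])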

UL2: Unifying language learning paradigms

Y Tay, M Dehghani, VQ Tran, X Garcia, J Wei… - arXiv preprint arXiv …, 2022 - arxiv.org
Existing pre-trained models are generally geared towards a particular class of problems. To
date, there still seems to be no consensus on what the right architecture and pre-training …

LongBench: A bilingual, multitask benchmark for long context understanding

Y Bai, X Lv, J Zhang, H Lyu, J Tang, Z Huang… - arXiv preprint arXiv …, 2023 - arxiv.org
Although large language models (LLMs) demonstrate impressive performance for many
language tasks, most of them can only handle texts a few thousand tokens long, limiting their …

Finetuned language models are zero-shot learners

J Wei, M Bosma, VY Zhao, K Guu, AW Yu… - arXiv preprint arXiv …, 2021 - arxiv.org
This paper explores a simple method for improving the zero-shot learning abilities of
language models. We show that instruction tuning--finetuning language models on a …

LongT5: Efficient text-to-text transformer for long sequences

M Guo, J Ainslie, D Uthus, S Ontanon, J Ni… - arXiv preprint arXiv …, 2021 - arxiv.org
Recent work has shown that either (1) increasing the input length or (2) increasing model
size can improve the performance of Transformer-based neural models. In this paper, we …

Graph neural networks for natural language processing: A survey

L Wu, Y Chen, K Shen, X Guo, H Gao… - … and Trends® in …, 2023 - nowpublishers.com
Deep learning has become the dominant approach in addressing various tasks in Natural
Language Processing (NLP). Although text inputs are typically represented as a sequence …

Delta tuning: A comprehensive study of parameter efficient methods for pre-trained language models

N Ding, Y Qin, G Yang, F Wei, Z Yang, Y Su… - arXiv preprint arXiv …, 2022 - arxiv.org
Despite their success, fine-tuning large-scale pre-trained language models (PLMs) brings
prohibitive adaptation costs. In fact, fine-tuning all the parameters of a colossal model and retaining …