Retrieval-augmented generation for large language models: A survey
Y Gao, Y Xiong, X Gao, K Jia, J Pan, Y Bi, Y Dai… - arXiv preprint arXiv …, 2023 - arxiv.org
Large language models (LLMs) demonstrate powerful capabilities, but they still face
challenges in practical applications, such as hallucinations, slow knowledge updates, and …
Survey of hallucination in natural language generation
Natural Language Generation (NLG) has improved exponentially in recent years thanks to
the development of sequence-to-sequence deep learning technologies such as Transformer …
Siren's song in the AI ocean: a survey on hallucination in large language models
While large language models (LLMs) have demonstrated remarkable capabilities across a
range of downstream tasks, a significant concern revolves around their propensity to exhibit …
Unlimiformer: Long-range transformers with unlimited length input
Since the proposal of transformers, these models have been limited to bounded input
lengths, because of their need to attend to every token in the input. In this work, we propose …
Retentive network: A successor to transformer for large language models
In this work, we propose Retentive Network (RetNet) as a foundation architecture for large
language models, simultaneously achieving training parallelism, low-cost inference, and …
Exploring the limits of ChatGPT for query or aspect-based text summarization
Text summarization has been a crucial problem in natural language processing (NLP) for
several decades. It aims to condense lengthy documents into shorter versions while …
Megalodon: Efficient LLM pretraining and inference with unlimited context length
The quadratic complexity and weak length extrapolation of Transformers limit their ability to
scale to long sequences, and while sub-quadratic solutions like linear attention and state …
Self-critiquing models for assisting human evaluators
We fine-tune large language models to write natural language critiques (natural language
critical comments) using behavioral cloning. On a topic-based summarization task, critiques …
Effective long-context scaling of foundation models
We present a series of long-context LLMs that support effective context windows of up to
32,768 tokens. Our model series are built through continual pretraining from Llama 2 with …
Retrieval meets long context large language models
Extending the context window of large language models (LLMs) has recently become popular,
while the solution of augmenting LLMs with retrieval has existed for years. The natural …