A comprehensive overview of large language models

H Naveed, AU Khan, S Qiu, M Saqib, S Anwar… - arXiv preprint arXiv …, 2023 - arxiv.org
Large Language Models (LLMs) have recently demonstrated remarkable capabilities in
natural language processing tasks and beyond. This success of LLMs has led to a large …

Advancing transformer architecture in long-context large language models: A comprehensive survey

Y Huang, J Xu, J Lai, Z Jiang, T Chen, Z Li… - arXiv preprint arXiv …, 2023 - arxiv.org
With the explosion of interest ignited by ChatGPT, Transformer-based Large Language Models (LLMs)
have paved a revolutionary path toward Artificial General Intelligence (AGI) and have been …

Mamba: Linear-time sequence modeling with selective state spaces

A Gu, T Dao - arXiv preprint arXiv:2312.00752, 2023 - arxiv.org
Foundation models, now powering most of the exciting applications in deep learning, are
almost universally based on the Transformer architecture and its core attention module …
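
The paper's core idea is a selective state-space recurrence whose cost is linear in sequence length. A minimal sketch of that recurrence follows, in a heavily simplified per-channel form: the real model uses hardware-aware parallel scans and learned projection layers, and every name and shape below is illustrative rather than the paper's API.

```python
# Toy selective SSM scan: h_t = exp(delta_t * A) * h_{t-1} + delta_t * B_t * x_t,
# y_t = C_t * h_t, with B, C, and the step size delta depending on the input x_t.
import numpy as np

def selective_ssm(x, A, W_B, W_C, W_delta):
    seq_len, d = x.shape
    h = np.zeros(d)
    ys = []
    for t in range(seq_len):
        delta = np.log1p(np.exp(x[t] @ W_delta))        # softplus keeps delta > 0
        B_t, C_t = x[t] @ W_B, x[t] @ W_C               # input-dependent (selective)
        h = np.exp(delta * A) * h + delta * B_t * x[t]  # discretized state update
        ys.append(C_t * h)                              # per-channel readout
    return np.stack(ys)

rng = np.random.default_rng(0)
d = 8
x = rng.normal(size=(32, d))
A = -np.abs(rng.normal(size=d))  # negative A keeps the recurrence stable
y = selective_ssm(x, A, rng.normal(size=(d, d)) * 0.1,
                  rng.normal(size=(d, d)) * 0.1, rng.normal(size=(d, d)) * 0.1)
print(y.shape)  # (32, 8): linear in sequence length, constant-size state
```

Unlike attention, the state h has fixed size, so memory and compute do not grow quadratically with context length.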

Vision Mamba: Efficient visual representation learning with bidirectional state space model

L Zhu, B Liao, Q Zhang, X Wang, W Liu… - arXiv preprint arXiv …, 2024 - arxiv.org
Recently, state space models (SSMs) with efficient hardware-aware designs, i.e., the
Mamba deep learning model, have shown great potential for long sequence modeling …
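
Since an SSM scan is causal but image patches have no natural ordering, the bidirectional variant runs the scan in both directions over the flattened patch sequence. A minimal sketch under that assumption, with a toy decaying recurrence standing in for the actual Mamba block:

```python
# Bidirectional scanning over flattened image patches: each patch sees
# context from both directions. `scan` is a stand-in recurrence, not Mamba.
import numpy as np

def scan(tokens, decay=0.9):
    h = np.zeros(tokens.shape[1])
    out = np.empty_like(tokens)
    for t in range(tokens.shape[0]):
        h = decay * h + tokens[t]  # toy causal recurrence
        out[t] = h
    return out

patches = np.random.default_rng(0).normal(size=(196, 64))  # 14x14 patches, dim 64
forward = scan(patches)
backward = scan(patches[::-1])[::-1]  # reverse scan, realigned to patch order
features = forward + backward         # each patch mixes both directions
print(features.shape)                 # (196, 64)
```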

Efficient large language models: A survey

Z Wan, X Wang, C Liu, S Alam, Y Zheng, J Liu… - arXiv preprint arXiv …, 2023 - arxiv.org
Large Language Models (LLMs) have demonstrated remarkable capabilities in important
tasks such as natural language understanding and language generation, and thus have the …

RULER: What's the Real Context Size of Your Long-Context Language Models?

CP Hsieh, S Sun, S Kriman, S Acharya… - arXiv preprint arXiv …, 2024 - arxiv.org
The needle-in-a-haystack (NIAH) test, which examines the ability to retrieve a piece of
information (the "needle") from long distractor texts (the "haystack"), has been widely …
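
The NIAH recipe the snippet describes is straightforward to reproduce: hide a fact at a chosen depth in long filler text, ask the model to retrieve it, and score exact recall. A minimal sketch follows; `query_model` is a hypothetical stub, to be replaced with any long-context LLM client.

```python
# Minimal needle-in-a-haystack probe of the kind RULER builds on.
def build_haystack(needle, depth, n_filler=2000):
    filler = "The grass is green and the sky is blue. " * n_filler
    pos = int(len(filler) * depth)  # depth in [0, 1] places the needle
    return filler[:pos] + needle + " " + filler[pos:]

def query_model(prompt):
    raise NotImplementedError("plug in a long-context LLM API call here")

needle = "The secret passcode is 7421."
prompt = (build_haystack(needle, depth=0.5)
          + "\n\nQuestion: What is the secret passcode? Answer:")
# answer = query_model(prompt)
# print("retrieved" if "7421" in answer else "missed")
```

RULER's point is that passing this simple probe is necessary but not sufficient: it extends the idea with harder multi-needle, tracing, and aggregation tasks.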

Adapted large language models can outperform medical experts in clinical text summarization

D Van Veen, C Van Uden, L Blankemeier… - Nature Medicine, 2024 - nature.com
Analyzing vast textual data and summarizing key information from electronic health records
imposes a substantial burden on how clinicians allocate their time. Although large language …

In-context autoencoder for context compression in a large language model

T Ge, J Hu, L Wang, X Wang, SQ Chen… - arXiv preprint arXiv …, 2023 - arxiv.org
We propose the In-context Autoencoder (ICAE), leveraging the power of a large language
model (LLM) to compress a long context into short compact memory slots that can be directly …
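
The compression mechanism the snippet describes can be sketched as follows: append a handful of learnable memory tokens to the long context, encode once, and keep only the memory tokens' final hidden states as a short stand-in for the full context. The encoder below is a single random mixing step purely for shape illustration; ICAE itself uses a LoRA-adapted LLM, and all names here are assumptions.

```python
# Context compression into memory slots, ICAE-style (toy shapes only).
import numpy as np

rng = np.random.default_rng(0)
d, ctx_len, n_slots = 64, 1024, 16
context = rng.normal(size=(ctx_len, d))        # token embeddings of a long context
memory = rng.normal(size=(n_slots, d)) * 0.02  # learnable memory-slot embeddings

def encode(tokens, W):
    # Stand-in for a transformer encoder: one global attention-like mixing step.
    attn = tokens @ tokens.T / np.sqrt(d)
    attn = np.exp(attn - attn.max(axis=-1, keepdims=True))
    attn /= attn.sum(axis=-1, keepdims=True)
    return attn @ tokens @ W

W = rng.normal(size=(d, d)) * 0.1
hidden = encode(np.concatenate([context, memory]), W)
slots = hidden[-n_slots:]  # compressed representation: 16 vectors instead of 1024
print(slots.shape)         # (16, 64), consumed by the decoder in place of the context
```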

LM-Infinite: Simple on-the-fly length generalization for large language models

C Han, Q Wang, W Xiong, Y Chen, H Ji… - arXiv preprint arXiv …, 2023 - arxiv.org
In recent years, there have been remarkable advancements in the performance of
Transformer-based Large Language Models (LLMs) across various domains. As these LLMs …

Scaling Transformer to 1M tokens and beyond with RMT

A Bulatov, Y Kuratov, Y Kapushev… - arXiv preprint arXiv …, 2023 - arxiv.org
A major limitation for the broader scope of problems solvable by transformers is the
quadratic scaling of computational complexity with input size. In this study, we investigate …
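
The recurrent-memory idea that sidesteps this quadratic scaling can be sketched as follows: split the input into fixed-size segments and carry a few memory tokens from one segment to the next, so each forward pass attends only within a segment. `process_segment` is a toy stand-in for a transformer pass, and all shapes are illustrative.

```python
# Segment-level recurrence with memory tokens, RMT-style (toy sketch).
import numpy as np

rng = np.random.default_rng(0)
d, seg_len, n_mem = 32, 128, 4

def process_segment(mem, seg, W):
    # Stand-in for a transformer over the [memory; segment] token sequence.
    tokens = np.tanh(np.concatenate([mem, seg]) @ W)
    return tokens[:n_mem], tokens[n_mem:]  # updated memory, segment outputs

W = rng.normal(size=(d, d)) * 0.1
long_input = rng.normal(size=(8 * seg_len, d))  # 1024 tokens total
mem = np.zeros((n_mem, d))
outputs = []
for start in range(0, long_input.shape[0], seg_len):
    mem, out = process_segment(mem, long_input[start:start + seg_len], W)
    outputs.append(out)
# Each segment pays attention cost O((seg_len + n_mem)^2) instead of
# O(total_len^2), so total cost grows linearly with the number of segments.
print(np.concatenate(outputs).shape)  # (1024, 32)
```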