- Academic Search

A comprehensive overview of large language models

H Naveed, AU Khan, S Qiu, M Saqib, S Anwar… - arxiv preprint arxiv …, 2023 - arxiv.org

Large Language Models (LLMs) have recently demonstrated remarkable capabilities in
natural language processing tasks and beyond. This success of LLMs has led to a large …

Cited by: 700 · All 3 versions

Llamafactory: Unified efficient fine-tuning of 100+ language models

Y Zheng, R Zhang, J Zhang, Y Ye, Z Luo… - arxiv preprint arxiv …, 2024 - arxiv.org

Efficient fine-tuning is vital for adapting large language models (LLMs) to downstream tasks.
However, it requires non-trivial efforts to implement these methods on different models. We …

Cited by: 257 · All 2 versions

Smaller, weaker, yet better: Training llm reasoners via compute-optimal sampling

H Bansal, A Hosseini, R Agarwal, VQ Tran… - arxiv preprint arxiv …, 2024 - arxiv.org

Training on high-quality synthetic data from strong language models (LMs) is a common
strategy to improve the reasoning performance of LMs. In this work, we revisit whether this …

Cited by: 21 · All 2 versions

Recranker: Instruction tuning large language model as ranker for top-k recommendation

S Luo, B He, H Zhao, W Shao, Y Qi, Y Huang… - ACM Transactions on …, 2024 - dl.acm.org

Large Language Models (LLMs) have demonstrated remarkable capabilities and have been
extensively deployed across various domains, including recommender systems. Prior …

Cited by: 22 · All 3 versions

A comprehensive survey of small language models in the era of large language models: Techniques, enhancements, applications, collaboration with llms, and …

F Wang, Z Zhang, X Zhang, Z Wu, T Mo, Q Lu… - arxiv preprint arxiv …, 2024 - ai.radensa.ru

Large language models (LLM) have demonstrated emergent abilities in text generation,
question answering, and reasoning, facilitating various tasks and domains. Despite their …

Cited by: 4 · All 2 versions

Deepseek-vl2: Mixture-of-experts vision-language models for advanced multimodal understanding

Z Wu, X Chen, Z Pan, X Liu, W Liu, D Dai… - arxiv preprint arxiv …, 2024 - arxiv.org

We present DeepSeek-VL2, an advanced series of large Mixture-of-Experts (MoE) Vision-
Language Models that significantly improves upon its predecessor, DeepSeek-VL, through …

Cited by: 8 · All 3 versions

Tablegpt2: A large multimodal model with tabular data integration

A Su, A Wang, C Ye, C Zhou, G Zhang, G Zhu… - arxiv preprint arxiv …, 2024 - arxiv.org

The emergence of models like GPTs, Claude, LLaMA, and Qwen has reshaped AI
applications, presenting vast new opportunities across industries. Yet, the integration of …

Cited by: 8 · All 2 versions

Derail Yourself: Multi-turn LLM Jailbreak Attack through Self-discovered Clues

Q Ren, H Li, D Liu, Z Xie, X Lu, Y Qiao, L Sha… - arxiv preprint arxiv …, 2024 - arxiv.org

This study exposes the safety vulnerabilities of Large Language Models (LLMs) in multi-turn
interactions, where malicious users can obscure harmful intents across several queries. We …

Cited by: 5 · All 3 versions

Memorize step by step: Efficient long-context prefilling with incremental memory and decremental chunk

Z Zeng, Q Guo, X Liu, Z Yin, W Shu… - Proceedings of the …, 2024 - aclanthology.org

The evolution of Large Language Models (LLMs) has led to significant
advancements, with models like Claude and Gemini capable of processing contexts up to 1 …

Cited by: 2

Fire-Flyer AI-HPC: A Cost-Effective Software-Hardware Co-Design for Deep Learning

W An, X Bi, G Chen, S Chen, C Deng… - … Conference for High …, 2024 - ieeexplore.ieee.org

The rapid progress in Deep Learning (DL) and Large Language Models (LLMs) has
exponentially increased demands of computational power and bandwidth. This, combined …

Cited by: 1 · All 4 versions
