Google Scholar

A Survey of Multimodel Large Language Models

Z Liang, Y Xu, Y Hong, P Shang, Q Wang… - Proceedings of the 3rd …, 2024 - dl.acm.org

With the widespread application of the Transformer architecture in various modalities,
including vision, the technology of large language models is evolving from a single modality …

Cited by 1248 · All 12 versions

[PDF] springer.com

A survey on large language model based autonomous agents

L Wang, C Ma, X Feng, Z Zhang, H Yang… - Frontiers of Computer …, 2024 - Springer

Autonomous agents have long been a research focus in academic and industry
communities. Previous research often focuses on training agents with limited knowledge …

Cited by 1024 · All 7 versions

[PDF] neurips.cc

Visual instruction tuning

H Liu, C Li, Q Wu, YJ Lee - Advances in neural information …, 2023 - proceedings.neurips.cc

Instruction tuning large language models (LLMs) using machine-generated instruction-
following data has been shown to improve zero-shot capabilities on new tasks, but the idea …

Cited by 5522 · All 18 versions

[PDF] arxiv.org

A comprehensive overview of large language models

H Naveed, AU Khan, S Qiu, M Saqib, S Anwar… - arXiv preprint arXiv …, 2023 - arxiv.org

Large Language Models (LLMs) have recently demonstrated remarkable capabilities in
natural language processing tasks and beyond. This success of LLMs has led to a large …

Cited by 777 · All 4 versions

[PDF] thecvf.com

Lisa: Reasoning segmentation via large language model

X Lai, Z Tian, Y Chen, Y Li, Y Yuan… - Proceedings of the …, 2024 - openaccess.thecvf.com

Although perception systems have made remarkable advancements in recent years, they still
rely on explicit human instruction or pre-defined categories to identify the target objects …

Cited by 405 · All 7 versions

[PDF] arxiv.org

mplug-owl: Modularization empowers large language models with multimodality

Q Ye, H Xu, G Xu, J Ye, M Yan, Y Zhou, J Wang… - arXiv preprint arXiv …, 2023 - arxiv.org

Large language models (LLMs) have demonstrated impressive zero-shot abilities on a
variety of open-ended tasks, while recent research has also explored the use of LLMs for …

Cited by 841 · All 2 versions

[PDF] neurips.cc

Visionllm: Large language model is also an open-ended decoder for vision-centric tasks

W Wang, Z Chen, X Chen, J Wu… - Advances in …, 2023 - proceedings.neurips.cc

Large language models (LLMs) have notably accelerated progress towards artificial general
intelligence (AGI), with their impressive zero-shot capacity for user-tailored tasks, endowing …

Cited by 448 · All 7 versions

[PDF] arxiv.org

Instruction tuning for large language models: A survey

S Zhang, L Dong, X Li, S Zhang, X Sun, S Wang… - arXiv preprint arXiv …, 2023 - arxiv.org

This paper surveys research works in the quickly advancing field of instruction tuning (IT),
which can also be referred to as supervised fine-tuning (SFT) …

Cited by 752 · All 5 versions

[PDF] stableaiprompts.com

[PDF] The dawn of LMMs: Preliminary explorations with GPT-4V(ision)

Z Yang, L Li, K Lin, J Wang, CC Lin… - arXiv preprint arXiv …, 2023 - stableaiprompts.com

Large multimodal models (LMMs) extend large language models (LLMs) with multi-sensory
skills, such as visual understanding, to achieve stronger generic intelligence. In this paper …

Cited by 597 · All 4 versions

[PDF] arxiv.org

Video-llava: Learning united visual representation by alignment before projection

B Lin, Y Ye, B Zhu, J Cui, M Ning, P Jin… - arXiv preprint arXiv …, 2023 - arxiv.org

The Large Vision-Language Model (LVLM) has enhanced the performance of various
downstream tasks in visual-language understanding. Most existing approaches encode …

Cited by 462 · All 3 versions
