Google Scholar

A Survey of Multimodel Large Language Models

Z Liang, Y Xu, Y Hong, P Shang, Q Wang… - Proceedings of the 3rd …, 2024 - dl.acm.org

With the widespread application of the Transformer architecture in various modalities,
including vision, the technology of large language models is evolving from a single modality …

Cited by 1224 · All 12 versions

[PDF] arxiv.org

MM-LLMs: Recent advances in multimodal large language models

D Zhang, Y Yu, J Dong, C Li, D Su, C Chu… - arXiv preprint arXiv …, 2024 - arxiv.org

In the past year, MultiModal Large Language Models (MM-LLMs) have undergone
substantial advancements, augmenting off-the-shelf LLMs to support MM inputs or outputs …

Cited by 223 · All 6 versions

[PDF] arxiv.org

Qwen technical report

J Bai, S Bai, Y Chu, Z Cui, K Dang, X Deng… - arXiv preprint arXiv …, 2023 - arxiv.org

Large language models (LLMs) have revolutionized the field of artificial intelligence,
enabling natural language processing tasks that were previously thought to be exclusive to …

Cited by 2454 · All 6 versions

[PDF] neurips.cc

CogVLM: Visual expert for pretrained language models

W Wang, Q Lv, W Yu, W Hong, J Qi… - Advances in …, 2025 - proceedings.neurips.cc

We introduce CogVLM, a powerful open-source visual language foundation model. Different
from the popular shallow alignment method which maps image features into the …

Cited by 579 · All 5 versions

[PDF] thecvf.com

Sigmoid loss for language image pre-training

X Zhai, B Mustafa, A Kolesnikov… - Proceedings of the …, 2023 - openaccess.thecvf.com

We propose a simple pairwise sigmoid loss for image-text pre-training. Unlike standard
contrastive learning with softmax normalization, the sigmoid loss operates solely on image …

Cited by 700 · All 5 versions

[PDF] thecvf.com

mPLUG-Owl2: Revolutionizing multi-modal large language model with modality collaboration

Q Ye, H Xu, J Ye, M Yan, A Hu, H Liu… - Proceedings of the …, 2024 - openaccess.thecvf.com

Multi-modal Large Language Models (MLLMs) have demonstrated impressive
instruction abilities across various open-ended tasks. However, previous methods have …

Cited by 359 · All 6 versions

[PDF] openreview.net

PaLM-E: An embodied multimodal language model

D Driess, F Xia, MSM Sajjadi, C Lynch, A Chowdhery… - 2023 - openreview.net

Large language models excel at a wide range of complex tasks. However, enabling general
inference in the real world, e.g. for robotics problems, raises the challenge of grounding. We …

Cited by 1663 · All 6 versions

[PDF] arxiv.org

Towards generalist biomedical AI

T Tu, S Azizi, D Driess, M Schaekermann, M Amin… - NEJM AI, 2024 - ai.nejm.org

Background: Medicine is inherently multimodal, requiring the simultaneous interpretation
and integration of insights between many data modalities spanning text, imaging, genomics …

Cited by 337 · All 4 versions

[PDF] acm.org

Harnessing the power of LLMs in practice: A survey on ChatGPT and beyond

J Yang, H Jin, R Tang, X Han, Q Feng, H Jiang… - ACM Transactions on …, 2024 - dl.acm.org

This article presents a comprehensive and practical guide for practitioners and end-users
working with Large Language Models (LLMs) in their downstream Natural Language …

Cited by 825 · All 7 versions

[PDF] neurips.cc

DataComp: In search of the next generation of multimodal datasets

SY Gadre, G Ilharco, A Fang… - Advances in …, 2023 - proceedings.neurips.cc

Multimodal datasets are a critical component in recent breakthroughs such as CLIP, Stable
Diffusion and GPT-4, yet their design does not receive the same research attention as model …

Cited by 372 · All 12 versions
