Google Scholar

B Peng, K Chen, M Li, P Feng, Z Bi, J Liu… - arXiv preprint arXiv …, 2024 - arxiv.org

Large Language Models (LLMs) demonstrate impressive capabilities across various fields,
yet their increasing use raises critical security concerns. This article reviews recent literature …

Cited by 15

Chat-UniVi: Unified visual representation empowers large language models with image and video understanding

P Jin, R Takanobu, W Zhang… - Proceedings of the …, 2024 - openaccess.thecvf.com

Large language models have demonstrated impressive universal capabilities across a wide
range of open-ended tasks and have extended their utility to encompass multimodal …

Cited by 170

LanguageBind: Extending video-language pretraining to N-modality by language-based semantic alignment

B Zhu, B Lin, M Ning, Y Yan, J Cui, HF Wang… - arXiv preprint arXiv …, 2023 - arxiv.org

The video-language (VL) pretraining has achieved remarkable improvement in multiple
downstream tasks. However, the current VL pretraining framework is hard to extend to …

Cited by 156

UnifiedMLLM: Enabling unified representation for multi-modal multi-tasks with large language model

Z Li, W Wang, YQ Cai, X Qi, P Wang, D Zhang… - arXiv preprint arXiv …, 2024 - arxiv.org

Significant advancements have recently been achieved in the field of multi-modal large
language models (MLLMs), demonstrating their remarkable capabilities in understanding …

Cited by 6

Image segmentation in foundation model era: A survey

T Zhou, F Zhang, B Chang, W Wang, Y Yuan… - arXiv preprint arXiv …, 2024 - arxiv.org

Image segmentation is a long-standing challenge in computer vision, studied continuously
over several decades, as evidenced by seminal algorithms such as N-Cut, FCN, and …

Cited by 5

Chat-UniVi: Unified Visual Representation Empowers Large Language Models with Image and Video Understanding Supplementary Material

P Jin, R Takanobu, W Zhang, X Cao, L Yuan - openaccess.thecvf.com

Cited by 2

LLMBind: A unified modality-task integration framework

Securing large language models: Addressing bias, misinformation, and prompt attacks
