MM-LLMs: Recent advances in multimodal large language models
In the past year, MultiModal Large Language Models (MM-LLMs) have undergone
substantial advancements, augmenting off-the-shelf LLMs to support MM inputs or outputs …
Video-LLaVA: Learning united visual representation by alignment before projection
The Large Vision-Language Model (LVLM) has enhanced the performance of various
downstream tasks in visual-language understanding. Most existing approaches encode …
LlamaFactory: Unified efficient fine-tuning of 100+ language models
Efficient fine-tuning is vital for adapting large language models (LLMs) to downstream tasks.
However, it requires non-trivial efforts to implement these methods on different models. We …
InternVideo2: Scaling foundation models for multimodal video understanding
We introduce InternVideo2, a new family of video foundation models (ViFM) that achieve the
state-of-the-art results in video recognition, video-text tasks, and video-centric dialogue. Our …
InternVL: Scaling up vision foundation models and aligning for generic visual-linguistic tasks
The exponential growth of large language models (LLMs) has opened up numerous
possibilities for multi-modal AGI systems. However, the progress in vision and vision …
Chat-UniVi: Unified visual representation empowers large language models with image and video understanding
Large language models have demonstrated impressive universal capabilities across a wide
range of open-ended tasks and have extended their utility to encompass multimodal …
AdaShield: Safeguarding multimodal large language models from structure-based attack via adaptive shield prompting
With the advent and widespread deployment of Multimodal Large Language Models
(MLLMs), the imperative to ensure their safety has become increasingly pronounced …
Video understanding with large language models: A survey
With the burgeoning growth of online video platforms and the escalating volume of video
content, the demand for proficient video understanding tools has intensified markedly. Given …
WorldGPT: Empowering LLM as multimodal world model
World models are progressively being employed across diverse fields, extending from basic
environment simulation to complex scenario construction. However, existing models are …
When large language model agents meet 6G networks: Perception, grounding, and alignment
AI agents based on multimodal large language models (LLMs) are expected to revolutionize
human-computer interaction, and offer more personalized assistant services across various …