Turnitin
降AI改写
早检测系统
早降重系统
Turnitin-UK版
万方检测-期刊版
维普编辑部版
Grammarly检测
Paperpass检测
checkpass检测
PaperYY检测
A Survey of Multimodel Large Language Models
Z Liang, Y Xu, Y Hong, P Shang, Q Wang… - Proceedings of the 3rd …, 2024 - dl.acm.org
With the widespread application of the Transformer architecture in various modalities,
including vision, the technology of large language models is evolving from a single modality …
including vision, the technology of large language models is evolving from a single modality …
The rise and potential of large language model based agents: A survey
For a long time, researchers have sought artificial intelligence (AI) that matches or exceeds
human intelligence. AI agents, which are artificial entities capable of sensing the …
human intelligence. AI agents, which are artificial entities capable of sensing the …
[PDF][PDF] A survey of large language models
Ever since the Turing Test was proposed in the 1950s, humans have explored the mastering
of language intelligence by machine. Language is essentially a complex, intricate system of …
of language intelligence by machine. Language is essentially a complex, intricate system of …
Mmbench: Is your multi-modal model an all-around player?
Large vision-language models (VLMs) have recently achieved remarkable progress,
exhibiting impressive multimodal perception and reasoning abilities. However, effectively …
exhibiting impressive multimodal perception and reasoning abilities. However, effectively …
Siren's song in the AI ocean: a survey on hallucination in large language models
While large language models (LLMs) have demonstrated remarkable capabilities across a
range of downstream tasks, a significant concern revolves around their propensity to exhibit …
range of downstream tasks, a significant concern revolves around their propensity to exhibit …
Next-gpt: Any-to-any multimodal llm
While recently Multimodal Large Language Models (MM-LLMs) have made exciting strides,
they mostly fall prey to the limitation of only input-side multimodal understanding, without the …
they mostly fall prey to the limitation of only input-side multimodal understanding, without the …
Video-llama: An instruction-tuned audio-visual language model for video understanding
We present Video-LLaMA a multi-modal framework that empowers Large Language Models
(LLMs) with the capability of understanding both visual and auditory content in the video …
(LLMs) with the capability of understanding both visual and auditory content in the video …
Multimodal foundation models: From specialists to general-purpose assistants
Neural compression is the application of neural networks and other machine learning
methods to data compression. Recent advances in statistical machine learning have opened …
methods to data compression. Recent advances in statistical machine learning have opened …
Video-chatgpt: Towards detailed video understanding via large vision and language models
Conversation agents fueled by Large Language Models (LLMs) are providing a new way to
interact with visual data. While there have been initial attempts for image-based …
interact with visual data. While there have been initial attempts for image-based …
Generative multimodal models are in-context learners
Humans can easily solve multimodal tasks in context with only a few demonstrations or
simple instructions which current multimodal systems largely struggle to imitate. In this work …
simple instructions which current multimodal systems largely struggle to imitate. In this work …