The rise and potential of large language model based agents: A survey
BLIP-2: Bootstrapping language-image pre-training with frozen image encoders and large language models
The cost of vision-and-language pre-training has become increasingly prohibitive due to
end-to-end training of large-scale models. This paper proposes BLIP-2, a generic and …
MiniGPT-v2: large language model as a unified interface for vision-language multi-task learning
Large language models have shown their remarkable capabilities as a general interface for
various language-related applications. Motivated by this, we target to build a unified …
EmbodiedGPT: Vision-language pre-training via embodied chain of thought
Embodied AI is a crucial frontier in robotics, capable of planning and executing action
sequences for robots to accomplish long-horizon tasks in physical environments. In this …
A survey of large language models for healthcare: from data, technology, and applications to accountability and ethics
The utilization of large language models (LLMs) for Healthcare has generated both
excitement and concern due to their ability to effectively respond to free-text queries with …
Foundations & trends in multimodal machine learning: Principles, challenges, and open questions
Multimodal machine learning is a vibrant multi-disciplinary research field that aims to design
computer agents with intelligent capabilities such as understanding, reasoning, and learning …
Unleashing the power of edge-cloud generative AI in mobile networks: A survey of AIGC services
Artificial Intelligence-Generated Content (AIGC) is an automated method for generating,
manipulating, and modifying valuable and diverse data using AI algorithms creatively. This …