A complete survey on generative ai (aigc): Is chatgpt from gpt-4 to gpt-5 all you need?
As ChatGPT goes viral, generative AI (AIGC, aka AI-generated content) has made headlines
everywhere because of its ability to analyze and create text, images, and beyond. With such …
everywhere because of its ability to analyze and create text, images, and beyond. With such …
Vision-language pre-training: Basics, recent advances, and future trends
This monograph surveys vision-language pre-training (VLP) methods for multimodal
intelligence that have been developed in the last few years. We group these approaches …
intelligence that have been developed in the last few years. We group these approaches …
Llama-adapter v2: Parameter-efficient visual instruction model
How to efficiently transform large language models (LLMs) into instruction followers is
recently a popular research direction, while training LLM for multi-modal reasoning remains …
recently a popular research direction, while training LLM for multi-modal reasoning remains …
A metaverse: Taxonomy, components, applications, and open challenges
SM Park, YG Kim - IEEE access, 2022 - ieeexplore.ieee.org
Unlike previous studies on the Metaverse based on Second Life, the current Metaverse is
based on the social value of Generation Z that online and offline selves are not different …
based on the social value of Generation Z that online and offline selves are not different …
A review on the attention mechanism of deep learning
Attention has arguably become one of the most important concepts in the deep learning
field. It is inspired by the biological systems of humans that tend to focus on the distinctive …
field. It is inspired by the biological systems of humans that tend to focus on the distinctive …
From show to tell: A survey on deep learning-based image captioning
Connecting Vision and Language plays an essential role in Generative Intelligence. For this
reason, large research efforts have been devoted to image captioning, ie describing images …
reason, large research efforts have been devoted to image captioning, ie describing images …
A survey of natural language generation
This article offers a comprehensive review of the research on Natural Language Generation
(NLG) over the past two decades, especially in relation to data-to-text generation and text-to …
(NLG) over the past two decades, especially in relation to data-to-text generation and text-to …
Exploring and distilling posterior and prior knowledge for radiology report generation
Automatically generating radiology reports can improve current clinical practice in diagnostic
radiology. On one hand, it can relieve radiologists from the heavy burden of report writing; …
radiology. On one hand, it can relieve radiologists from the heavy burden of report writing; …
Generating radiology reports via memory-driven transformer
Medical imaging is frequently used in clinical practice and trials for diagnosis and treatment.
Writing imaging reports is time-consuming and can be error-prone for inexperienced …
Writing imaging reports is time-consuming and can be error-prone for inexperienced …
Metransformer: Radiology report generation by transformer with multiple learnable expert tokens
In clinical scenarios, multi-specialist consultation could significantly benefit the diagnosis,
especially for intricate cases. This inspires us to explore a" multi-expert joint diagnosis" …
especially for intricate cases. This inspires us to explore a" multi-expert joint diagnosis" …