A review of modern recommender systems using generative models (gen-recsys)
Traditional recommender systems typically use user-item rating histories as their main data
source. However, deep generative models now have the capability to model and sample …
Interpretability research of deep learning: A literature survey
B Xua, G Yang - Information Fusion, 2024 - Elsevier
Deep learning (DL) has been widely used in various fields. However, its black-box nature
limits people's understanding and trust in its decision-making process. Therefore, it becomes …
AdaShield: Safeguarding multimodal large language models from structure-based attack via adaptive shield prompting
With the advent and widespread deployment of Multimodal Large Language Models
(MLLMs), the imperative to ensure their safety has become increasingly pronounced …
BRAVE: Broadening the visual encoding of vision-language models
Vision-language models (VLMs) are typically composed of a vision encoder, e.g., CLIP, and a
language model (LM) that interprets the encoded features to solve downstream tasks …
LLM inference unveiled: Survey and roofline model insights
The field of efficient Large Language Model (LLM) inference is rapidly evolving, presenting a
unique blend of opportunities and challenges. Although the field has expanded and is …
BB-GeoGPT: A framework for learning a large language model for geographic information science
Large language models (LLMs) exhibit impressive capabilities across diverse tasks in
natural language processing. Nevertheless, challenges arise such as large model …
A survey of multimodal large language model from a data-centric perspective
Multimodal large language models (MLLMs) enhance the capabilities of standard large
language models by integrating and processing data from multiple modalities, including text …
MobileVLM V2: Faster and stronger baseline for vision language model
We introduce MobileVLM V2, a family of significantly improved vision language models
upon MobileVLM, which proves that a delicate orchestration of novel architectural design, an …
Facial affective behavior analysis with instruction tuning
Facial affective behavior analysis (FABA) is crucial for understanding human mental states
from images. However, traditional approaches primarily deploy models to discriminate …
A comprehensive review of multimodal large language models: Performance and challenges across different tasks
In an era defined by the explosive growth of data and rapid technological advancements,
Multimodal Large Language Models (MLLMs) stand at the forefront of artificial intelligence …