MMVQA: A comprehensive dataset for investigating multipage multimodal information retrieval in PDF-based visual question answering
Document Question Answering (QA) presents a challenge in understanding visually-rich
documents (VRD), particularly with lengthy textual content. Existing studies primarily …
Large Language Models in Finance (FinLLMs)
J Lee, N Stevens, SC Han - Neural Computing and Applications, 2025 - Springer
Large language models (LLMs) have demonstrated remarkable capabilities and have
attracted significant attention across diverse domains, including financial services. Despite …
StrucTexTv3: An Efficient Vision-Language Model for Text-rich Image Perception, Comprehension, and Beyond
Text-rich images have significant and extensive value, deeply integrated into various
aspects of human life. Notably, both visual cues and linguistic symbols in text-rich images …
AiBAT: Artificial Intelligence/Instructions for Build, Assembly, and Test
Instructions for Build, Assembly, and Test (IBAT) refers to the process used whenever any
operation is conducted on hardware, including tests, assembly, and maintenance. Currently …
MVQA: A Dataset for Multimodal Information Retrieval in PDF-based Visual Question Answering
Document Question Answering (QA) presents a challenge in understanding visually-rich
documents (VRD), particularly those dominated by lengthy textual content like research …
KVP10k: A Comprehensive Dataset for Key-Value Pair Extraction in Business Documents
In recent years, the challenge of extracting information from business documents has
emerged as a critical task, finding applications across numerous domains. This effort has …
DAViD: Domain Adaptive Visually-Rich Document Understanding with Synthetic Insights
Visually-Rich Documents (VRDs), encompassing elements like charts, tables, and
references, convey complex information across various fields. However, extracting …
Visually Rich Document Understanding and Intelligence
Y Ding - 2024 - ses.library.usyd.edu.au
Visually Rich Documents (VRDs) are potent carriers of multimodal information widely used
in academia, finance, medical fields, and marketing. Traditional approaches to extracting …
Natural Language Processing in Finance: Applications and Opportunities.
J Lee - 2024 - ses.library.usyd.edu.au
The research of Natural Language Processing (NLP) in Finance has experienced
considerable development driven by academia and industry. However, small benchmark …