Transforming decoder-only models into encoder-only models with improved understanding capabilities

Z Huang, X Huang, A Wu, X Wang, G Cheng - Knowledge-Based Systems, 2025 - Elsevier
The decoder-only architecture has become a key driver in the current wave of large
language models. However, its reliance on causal (i.e., left-side) attention limits its text …

A Simple and Effective Span Interaction Modeling Method for Enhancing Multiple Span Question Answering

Y Zhang, Z Luo, Z Ding - … on Natural Language Processing and Chinese …, 2024 - Springer
Although multi-span question-answering tasks align more closely with the complex
demands of the real world, existing models often struggle to effectively model the …