Retrieval-augmented generation for natural language processing: A survey

S Wu, Y Xiong, Y Cui, H Wu, C Chen, Y Yuan… - arxiv preprint arxiv …, 2024 - arxiv.org
Large language models (LLMs) have demonstrated great success in various fields,
benefiting from their huge amount of parameters that store knowledge. However, LLMs still …

A survey on long text modeling with transformers

Z Dong, T Tang, L Li, WX Zhao - arxiv preprint arxiv:2302.14502, 2023 - arxiv.org
Modeling long texts has been an essential technique in the field of natural language
processing (NLP). With the ever-growing number of long documents, it is important to …

Text matching improves sequential recommendation by reducing popularity biases

Z Liu, S Mei, C Xiong, X Li, S Yu, Z Liu, Y Gu… - Proceedings of the 32nd …, 2023 - dl.acm.org
This paper proposes Text mAtching based SequenTial rEcommendation model (TASTE),
which maps items and users in an embedding space and recommends items by matching …

End-to-end spoken conversational question answering: Task, dataset and model

C You, N Chen, F Liu, S Ge, X Wu, Y Zou - arxiv preprint arxiv:2204.14272, 2022 - arxiv.org
In spoken question answering, the systems are designed to answer questions from
contiguous text spans within the related speech transcripts. However, the most natural way …

LogPrécis: Unleashing language models for automated malicious log analysis: Précis: A concise summary of essential points, statements, or facts

M Boffa, I Drago, M Mellia, L Vassio, D Giordano… - Computers & …, 2024 - Elsevier
Security logs are the key to understanding attacks and diagnosing vulnerabilities. Often
coming in the form of text logs, their analysis remains a daunting challenge. Language …

Towards data distillation for end-to-end spoken conversational question answering

C You, N Chen, F Liu, D Yang, Y Zou - arxiv preprint arxiv:2010.08923, 2020 - arxiv.org
In spoken question answering, QA systems are designed to answer questions from
contiguous text spans within the related speech transcripts. However, the most natural way …

Information extraction from lengthy legal contracts: leveraging query-based summarization and GPT-3.5

MM Zin, HT Nguyen, K Satoh… - Legal Knowledge and …, 2023 - ebooks.iospress.nl
In the legal domain, extracting information from contracts poses significant challenges,
primarily due to the scarcity of annotated data. In such situations, leveraging large language …

Capturing global structural information in long document question answering with compressive graph selector network

Y Nie, H Huang, W Wei, XL Mao - arxiv preprint arxiv:2210.05499, 2022 - arxiv.org
Long document question answering is a challenging task due to its demands for complex
reasoning over long text. Previous works usually take long documents as non-structured flat …

RoR: Read-over-read for long document machine reading comprehension

J Zhao, J Bao, Y Wang, Y Zhou, Y Wu, X He… - arxiv preprint arxiv …, 2021 - arxiv.org
Transformer-based pre-trained models, such as BERT, have achieved remarkable results on
machine reading comprehension. However, due to the constraint of encoding length (e.g. …

Drilling down into the discourse structure with LLMs for long document question answering

I Nair, S Somasundaram, A Saxena… - arxiv preprint arxiv …, 2023 - arxiv.org
We address the task of evidence retrieval for long document question answering, which
involves locating relevant paragraphs within a document to answer a question. We aim to …