Recent advances in neural text generation: A task-agnostic survey

C Tang, F Guerin, C Lin - arXiv preprint arXiv:2203.03047, 2022 - arxiv.org
In recent years, considerable research has been dedicated to the application of neural
models in the field of natural language generation (NLG). The primary objective is to …

Unlocking the heterogeneous landscape of big data NLP with DUUI

A Leonhardt, G Abrami, D Baumartz… - Findings of the …, 2023 - aclanthology.org
Automatic analysis of large corpora is a complex task, especially in terms of time efficiency.
This complexity is increased by the fact that flexible, extensible text analysis requires the …

Datalab: A platform for data analysis and intervention

Y Xiao, J Fu, W Yuan, V Viswanathan, Z Liu… - arXiv preprint arXiv …, 2022 - arxiv.org
Despite data's crucial role in machine learning, most existing tools and research tend to
focus on systems on top of existing data rather than how to interpret and manipulate data. In …

[PDF] Analysis of QA system behavior against context and question changes.

R Karra, A Lasfar - Int. Arab J. Inf. Technol., 2024 - iajit.org
Data quality has gained increasing attention across various research domains, including
pattern recognition, image processing, and Natural Language Processing (NLP). The goal of …

[PDF] Impact of Data Quality on Question Answering System Performances.

R Karra, A Lasfar - Intelligent Automation & Soft Computing, 2023 - academia.edu
In contrast with the research of new models, little attention has been paid to the impact of low
or high-quality data feeding a dialogue system. The present paper makes the first attempt to …

[PDF] Knowledge Enhanced Natural Language Generation

C Tang - 2024 - openresearch.surrey.ac.uk
The studies in this thesis aim to address overarching challenges persisting across Natural
Language Generation (NLG) applications, including (1) Limitations in effectively …

A Data-centric Framework for Improving Domain-specific Machine Reading Comprehension Datasets

I Bojic, J Halim, V Suharman, S Tar, QC Ong… - arXiv preprint arXiv …, 2023 - arxiv.org
Low-quality data can cause downstream problems in high-stakes applications. Data-centric
approach emphasizes on improving dataset quality to enhance model performance. High …

DataCI: A Platform for Data-Centric AI on Streaming Data

H Zhang, Y Huang, Y Li - arXiv preprint arXiv:2306.15538, 2023 - arxiv.org
We introduce DataCI, a comprehensive open-source platform designed specifically for data-
centric AI in dynamic streaming data settings. DataCI provides 1) an infrastructure with rich …

[BOOK] Information Extraction from Unstructured Big Data: A Case Study of Deep Natural Language Processing in Fintech

B Dash - 2022 - search.proquest.com
In today's digital age, enterprises generate a large amount of data as part of their daily
operations. This data is stored in structured, semi-structured, and unstructured formats on …

Cross-document event identity via dense annotation

A Pratapa, Z Liu, K Hasegawa, L Li… - arXiv preprint arXiv …, 2021 - arxiv.org
In this paper, we study the identity of textual events from different documents. While the
complex nature of event identity is previously studied (Hovy et al., 2013), the case of events …