Repairing the cracked foundation: A survey of obstacles in evaluation practices for generated text
Abstract Evaluation practices in natural language generation (NLG) have many known flaws,
but improved evaluation approaches are rarely widely adopted. This issue has become …
but improved evaluation approaches are rarely widely adopted. This issue has become …
Using natural language processing to support peer‐feedback in the age of artificial intelligence: A cross‐disciplinary framework and a research agenda
Advancements in artificial intelligence are rapidly increasing. The new‐generation large
language models, such as ChatGPT and GPT‐4, bear the potential to transform educational …
language models, such as ChatGPT and GPT‐4, bear the potential to transform educational …
Data governance in the age of large-scale data-driven language technology
The recent emergence and adoption of Machine Learning technology, and specifically of
Large Language Models, has drawn attention to the need for systematic and transparent …
Large Language Models, has drawn attention to the need for systematic and transparent …
One country, 700+ languages: NLP challenges for underrepresented languages and dialects in Indonesia
NLP research is impeded by a lack of resources and awareness of the challenges presented
by underrepresented languages and dialects. Focusing on the languages spoken in …
by underrepresented languages and dialects. Focusing on the languages spoken in …
Multi 3 WOZ: A Multilingual, Multi-Domain, Multi-Parallel Dataset for Training and Evaluating Culturally Adapted Task-Oriented Dialog Systems
Creating high-quality annotated data for task-oriented dialog (ToD) is known to be
notoriously difficult, and the challenges are amplified when the goal is to create equitable …
notoriously difficult, and the challenges are amplified when the goal is to create equitable …
[PDF][PDF] An analysis of data quality requirements for machine learning development pipelines frameworks
S Rangineni - International Journal of Computer Trends and …, 2023 - researchgate.net
The importance of meeting data quality standards in the context of Machine Learning (ML)
development pipelines is explored in this study. It delves deep into why good data is crucial …
development pipelines is explored in this study. It delves deep into why good data is crucial …
Handling and presenting harmful text in NLP research
Text data can pose a risk of harm. However, the risks are not fully understood, and how to
handle, present, and discuss harmful text in a safe way remains an unresolved issue in the …
handle, present, and discuss harmful text in a safe way remains an unresolved issue in the …
A survey of data quality requirements that matter in ML development pipelines
The fitness of the systems in which Machine Learning (ML) is used depends greatly on good-
quality data. Specifications on what makes a good-quality dataset have traditionally been …
quality data. Specifications on what makes a good-quality dataset have traditionally been …
REFORMS: Consensus-based Recommendations for Machine-learning-based Science
Machine learning (ML) methods are proliferating in scientific research. However, the
adoption of these methods has been accompanied by failures of validity, reproducibility, and …
adoption of these methods has been accompanied by failures of validity, reproducibility, and …
Quantifying the dialect gap and its correlates across languages
Historically, researchers and consumers have noticed a decrease in quality when applying
NLP tools to minority variants of languages (ie Puerto Rican Spanish or Swiss German), but …
NLP tools to minority variants of languages (ie Puerto Rican Spanish or Swiss German), but …