Data oversampling and imbalanced datasets: An investigation of performance for machine learning and feature engineering

M Mujahid, E Kına, F Rustam, MG Villar, ES Alvarado… - Journal of Big Data, 2024 - Springer
The classification of imbalanced datasets is a prominent task in text mining and machine
learning. The number of samples in each class is not uniformly distributed; one class …

On the quality of synthetic generated tabular data

E Espinosa, A Figueira - Mathematics, 2023 - mdpi.com
Class imbalance is a common issue while developing classification models. In order to
tackle this problem, synthetic data have recently been developed to enhance the minority …

A review on machine learning aided multi-omics data integration techniques for healthcare

H Bansal, H Luthra, SR Raghuram - Data Analytics and Computational …, 2023 - Springer
To understand the mechanism of biological processes inside a human, it is necessary to
look at its various regulatory aspects, such as DNA methylation and post-translational …

FAIL: Analyzing Software Failures from the News Using LLMs

D Anandayuvaraj, M Campbell, A Tewari… - Proceedings of the 39th …, 2024 - dl.acm.org
Software failures inform engineering work, standards, regulations. For example, the Log4J
vulnerability brought government and industry attention to evaluating and securing software …

Deep Learning in Palmprint Recognition-A Comprehensive Survey

C Gao, Z Yang, W Jia, L Leng, B Zhang… - arxiv preprint arxiv …, 2025 - arxiv.org
Palmprint recognition has emerged as a prominent biometric technology, widely applied in
diverse scenarios. Traditional handcrafted methods for palmprint recognition often fall short …

Towards autonomous cybersecurity: A comparative analysis of agnostic and hybrid AI approaches for advanced persistent threat detection

A Hernández-Rivas, V Morales-Rocha… - … Applications of Artificial …, 2024 - Springer
The rapid evolution of cyber threats requires proactive and automated detection
mechanisms. Although machine learning shows potential in this area, current models …

Analysis of the performance of machine learning models in predicting the severity level of large-truck crashes

J Liu, Y Qi, J Tao, T Tao - Future transportation, 2022 - mdpi.com
Large-truck crashes often result in substantial economic and social costs. Accurate
prediction of the severity level of a reported truck crash can help rescue teams and …

A Comprehensive Survey on Imbalanced Data Learning

X Gao, D Xie, Y Zhang, Z Wang, C He, H Yin… - arxiv preprint arxiv …, 2025 - arxiv.org
With the expansion of data availability, machine learning (ML) has achieved remarkable
breakthroughs in both academia and industry. However, imbalanced data distributions are …

Empirical study of machine learning for intelligent bearing fault diagnosis

A Moghadam, FD Kakhki - … Management, Manufacturing, and …, 2023 - taylorfrancis.com
This study explores a machine learning (ML)-based fault detection and classification
approach in induction motors, investigating the impact of various data preparation and …

Learning of conversational systems based on linguistic data summarization applications in BIM environments

YO Vasconcelo Mir, I Pérez Pupo… - Data Analytics and …, 2023 - Springer
In this work, the authors identified opportunities for improvements in conversational systems.
In order to solve the conversational systems learning problems, this investigation proposes a …