Unsupervised learning methods for molecular simulation data

A Glielmo, BE Husic, A Rodriguez, C Clementi… - Chemical …, 2021 - ACS Publications
Unsupervised learning is becoming an essential tool to analyze the increasingly large
amounts of data produced by atomistic and molecular simulations, in material science, solid …

Neural decoding of EEG signals with machine learning: a systematic review

M Saeidi, W Karwowski, FV Farahani, K Fiok, R Taiar… - Brain sciences, 2021 - mdpi.com
Electroencephalography (EEG) is a non-invasive technique used to record the brain's
evoked and induced electrical activity from the scalp. Artificial intelligence, particularly …

BERTopic: Neural topic modeling with a class-based TF-IDF procedure

M Grootendorst - arXiv preprint arXiv:2203.05794, 2022 - arxiv.org
Topic models can be useful tools to discover latent topics in collections of documents.
Recent studies have shown the feasibility of approaching topic modeling as a clustering task …

Water quality prediction using machine learning models based on grid search method

MY Shams, AM Elshewey, ESM El-Kenawy… - Multimedia Tools and …, 2024 - Springer
Water quality is vitally important for humans, animals, plants, industries, and the environment.
In the last decades, the quality of water has been impacted by contamination and pollution …

The Matthews correlation coefficient (MCC) is more reliable than balanced accuracy, bookmaker informedness, and markedness in two-class confusion matrix …

D Chicco, N Tötsch, G Jurman - BioData mining, 2021 - Springer
Evaluating binary classifications is a pivotal task in statistics and machine learning, because
it can influence decisions in multiple areas, including for example prognosis or therapies of …

Survey of vector database management systems

JJ Pan, J Wang, G Li - The VLDB Journal, 2024 - Springer
There are now over 20 commercial vector database management systems (VDBMSs), all
produced within the past five years. But embedding-based retrieval has been studied for …

The advantages of the Matthews correlation coefficient (MCC) over F1 score and accuracy in binary classification evaluation

D Chicco, G Jurman - BMC genomics, 2020 - Springer
Background: To evaluate binary classifications and their confusion matrices, scientific
researchers can employ several statistical rates, according to the goal of the experiment …

Enhancing K-nearest neighbor algorithm: a comprehensive review and performance analysis of modifications

RK Halder, MN Uddin, MA Uddin, S Aryal, A Khraisat - Journal of Big Data, 2024 - Springer
The k-Nearest Neighbors (kNN) method, established in 1951, has since evolved
into a pivotal tool in data mining, recommendation systems, and Internet of Things (IoT) …

Outlier detection: Methods, models, and classification

A Boukerche, L Zheng, O Alfandi - ACM Computing Surveys (CSUR), 2020 - dl.acm.org
Over the past decade, we have witnessed an enormous amount of research effort dedicated
to the design of efficient outlier detection techniques while taking into consideration …

A comprehensive survey of anomaly detection techniques for high dimensional big data

S Thudumu, P Branch, J Jin, J Singh - Journal of Big Data, 2020 - Springer
Anomaly detection in high dimensional data is becoming a fundamental research problem
that has various applications in the real world. However, many existing anomaly detection …