Consumer segmentation with large language models
Y Li, Y Liu, M Yu - Journal of Retailing and Consumer Services, 2025 - Elsevier
Consumer segmentation is vital for companies to customize their offerings effectively. Our
study explores the application of Large Language Models (LLMs) in marketing research for …
Iterative cleaning and learning of big highly-imbalanced fraud data using unsupervised learning
Fraud datasets oftentimes lack consistent and accurate labels, and are characterized by
having high class imbalance where the number of fraudulent examples are far fewer than …
Fraud detection in healthcare claims using machine learning: A systematic review
Objective: Identifying fraud in healthcare programs is crucial, as an estimated 3%–10% of
the total healthcare expenditures are lost to fraudulent activities. This study presents a …
Medical provider embeddings for healthcare fraud detection
Advances in data mining and machine learning continue to transform the healthcare industry
and provide value to medical professionals and patients. In this study, we address the …
Leveraging LightGBM for categorical big data
LightGBM is a popular Gradient Boosted Decision Tree implementation for classification and
regression tasks. Our contribution is to answer a research question regarding LightGBM. We …
Encoding high-dimensional procedure codes for healthcare fraud detection
Machine learning applications for healthcare are reshaping the industry with new
tools and services designed to improve the quality of patient care. A challenge common to …
Categorical feature encoding techniques for improved classifier performance when dealing with imbalanced data of fraudulent transactions
Fraudulent transaction data tend to have several categorical features with high cardinality. It
makes data preprocessing complicated if categories in such features do not have an order …
Encoding techniques for high-cardinality features and ensemble learners
This study evaluates the classification performance of five encoding techniques for high-
cardinality categorical features. Encoding techniques are tested using popular bagging and …
Robust thresholding strategies for highly imbalanced and noisy data
Many studies have shown that non-default decision thresholds are required to maximize
classification performance on highly imbalanced data sets. Thresholding strategies include …
Exploring maximum tree depth and random undersampling in ensemble trees to optimize the classification of imbalanced big data
JT Hancock III, TM Khoshgoftaar - SN Computer Science, 2023 - Springer
We present findings from experiments in Medicare fraud detection, that are the result of
research on two new, publicly available datasets. In this research, we employ popular, open …