Regularized target encoding outperforms traditional methods in supervised machine learning with high cardinality features

F Pargent, F Pfisterer, J Thomas, B Bischl - Computational Statistics, 2022 - Springer
Since most machine learning (ML) algorithms are designed for numerical inputs, efficiently
encoding categorical variables is a crucial aspect in data analysis. A common problem are …

[PDF][PDF] A benchmark experiment on how to encode categorical features in predictive modeling

F Pargent, B Bischl, J Thomas - München: Ludwig-Maximilians …, 2019 - files.de-1.osf.io
In predictive modeling, high cardinality features (ie unordered categorical predictor variables
with a high number of levels) often pose problems, as most supervised machine learning …

Machine Learning Methods for the Detection of Fraudulent Insurance Claims

S Zhao - 2020 - spectrum.library.concordia.ca
This thesis focuses on automotive fraudulent claims detection, a particular Property and
Casualty (P&C) insurance product. By analyzing the customer's information, we try to define …