One or two things we know about concept drift—a survey on monitoring in evolving environments. Part A: detecting concept drift

F Hinder, V Vaquet, B Hammer - Frontiers in Artificial Intelligence, 2024 - frontiersin.org
The world surrounding us is subject to constant change. These changes, frequently
described as concept drift, influence many industrial and technical processes. As they can …

Survey: Exploiting data redundancy for optimization of deep learning

JA Chen, W Niu, B Ren, Y Wang, X Shen - ACM Computing Surveys, 2023 - dl.acm.org
Data redundancy is ubiquitous in the inputs and intermediate results of Deep Neural
Networks (DNN). It offers many significant opportunities for improving DNN performance and …

Revisiting classifier two-sample tests

D Lopez-Paz, M Oquab - arXiv preprint arXiv:1610.06545, 2016 - arxiv.org
The goal of two-sample tests is to assess whether two samples, $S_P \sim P^n$ and
$S_Q \sim Q^m$, are drawn from the same distribution. Perhaps intriguingly, one relatively …

Efficient estimation of mutual information for strongly dependent variables

S Gao, G Ver Steeg, A Galstyan - Artificial intelligence and …, 2015 - proceedings.mlr.press
We demonstrate that a popular class of non-parametric mutual information (MI) estimators
based on k-nearest-neighbor graphs requires number of samples that scales exponentially …

Quantifying causal influences

D Janzing, D Balduzzi, M Grosse-Wentrup, B Schölkopf - 2013 - projecteuclid.org
The Annals of Statistics 2013, Vol. 41, No. 5, 2324–2358
DOI: 10.1214/13-AOS1145 © Institute of Mathematical Statistics, 2013 …

Demystifying Fixed $k$-Nearest Neighbor Information Estimators

W Gao, S Oh, P Viswanath - IEEE Transactions on Information …, 2018 - ieeexplore.ieee.org
Estimating mutual information from independent identically distributed samples drawn from
an unknown joint density function is a basic statistical problem of broad interest with …

Estimating the directed information to infer causal relationships in ensemble neural spike train recordings

CJ Quinn, TP Coleman, N Kiyavash… - Journal of computational …, 2011 - Springer
Advances in recording technologies have given neuroscience researchers access to large
amounts of data, in particular, simultaneous, individual recordings of large groups of …

Model scale versus domain knowledge in statistical forecasting of chaotic systems

W Gilpin - Physical Review Research, 2023 - APS
Chaos and unpredictability are traditionally synonymous, yet large-scale machine-learning
methods recently have demonstrated a surprising ability to forecast chaotic systems well …

Summarization based on embedding distributions

H Kobayashi, M Noguchi, T Yatsuka - Proceedings of the 2015 …, 2015 - aclanthology.org
In this study, we consider a summarization method using the document level similarity based
on embeddings, or distributed representations of words, where we assume that an …

As if sand were stone. New concepts and metrics to probe the ground on which to build trustable AI

F Cabitza, A Campagner, LM Sconfienza - BMC Medical Informatics and …, 2020 - Springer
Background We focus on the importance of interpreting the quality of the labeling used as
the input of predictive models to understand the reliability of their output in support of human …