Event Stream GPT: a data pre-processing and modeling library for generative, pre-trained transformers over continuous-time sequences of complex events

M McDermott, B Nestor, P Argaw… - Advances in Neural …, 2023 - proceedings.neurips.cc
Generative, pre-trained transformers (GPTs, a type of "Foundation Models") have reshaped
natural language processing (NLP) through their versatility in diverse downstream tasks …

Yet another ICU benchmark: A flexible multi-center framework for clinical ML

R Van De Water, H Schmidt, P Elbers, P Thoral… - arXiv preprint arXiv …, 2023 - arxiv.org
Medical applications of machine learning (ML) have experienced a surge in popularity in
recent years. The intensive care unit (ICU) is a natural habitat for ML given the abundance of …

HoTPP Benchmark: Are We Good at the Long Horizon Events Forecasting?

I Karpukhin, F Shipilov, A Savchenko - arXiv preprint arXiv:2406.14341, 2024 - arxiv.org
Accurately forecasting multiple future events within a given time horizon is crucial for
finance, retail, social networks, and healthcare applications. Event timing and labels are …

MEDS-Tab: Automated tabularization and baseline methods for MEDS datasets

N Oufattole, T Bergamaschi, A Kolo, H Jeong… - arXiv preprint arXiv …, 2024 - arxiv.org
Effective, reliable, and scalable development of machine learning (ML) solutions for
structured electronic health record (EHR) data requires the ability to reliably generate high …

Medical event data standard (MEDS): Facilitating machine learning for health

B Arnrich, E Choi, JA Fries, MBA McDermott… - ICLR 2024 Workshop …, 2024 - openreview.net
We introduce the Medical Event Data Standard (MEDS), a lightweight schema for enabling
machine learning over electronic health record (EHR) data. Unlike common data models …

MF-CLR: multi-frequency contrastive learning representation for time series

J Duan, W Zheng, Y Du, W Wu, H Jiang… - Forty-first International …, 2024 - openreview.net
Learning a decent representation from unlabeled time series is a challenging task,
especially when the time series data is derived from diverse channels at different sampling …

ACES: Automatic Cohort Extraction System for Event-Stream Datasets

J Xu, J Gallifant, AEW Johnson… - arXiv preprint arXiv …, 2024 - arxiv.org
Reproducibility remains a significant challenge in machine learning (ML) for healthcare.
Datasets, model pipelines, and even task/cohort definitions are often private in this field …