VITALITY 2: Reviewing Academic Literature Using Large Language Models

H An, A Narechania, E Wall, K Xu - arXiv preprint arXiv:2408.13450, 2024 - arxiv.org
Academic literature reviews have traditionally relied on techniques such as keyword
searches and accumulation of relevant back-references, using databases like Google …

Coral: Code representation learning with weakly-supervised transformers for analyzing data analysis

G Zhang, MA Merrill, Y Liu, J Heer, T Althoff - EPJ Data Science, 2022 - epjds.epj.org
Large scale analysis of source code, and in particular scientific source code, holds the
promise of better understanding the data science process, identifying analytical best …

A Survey on Semantics in Automated Data Science

U Khurana, K Srinivas, H Samulowitz - arXiv preprint arXiv:2205.08018, 2022 - arxiv.org
Data Scientists leverage common sense reasoning and domain knowledge to understand
and enrich data for building predictive models. In recent years, we have witnessed a surge …

[PDF] In-class data analysis replications: Teaching students while testing science

K Gligorić, T Piccardi, JM Hofman… - Harvard Data Science …, 2024 - assets.pubpub.org
Science is facing a reproducibility crisis. Overcoming it would require concerted efforts to
replicate prior studies, but the incentives for researchers are currently weak, as replicating …

Towards AI-Assisted Data Science Development: Decoding, Visualising, and Enhancing Human-AI Collaboration for Data Science Workflows in Practice

D Ramasamy - 2024 - zora.uzh.ch
Data Science as a field has received enormous interest in recent times due to the various
technological advancements combined with advanced computing capabilities. This makes …

[BOOK] Supporting reliable data analysis by evaluating all reasonable analytic decisions

Y Liu - 2022 - search.proquest.com
Analysts make many, sometimes arbitrary, decisions throughout the data analysis pipeline,
yet different choices can lead to divergent conclusions. The flexibility of making analytic …