Exploration and explanation in computational notebooks

A Rule, A Tabard, JD Hollan - Proceedings of the 2018 CHI Conference …, 2018 - dl.acm.org
Computational notebooks combine code, visualizations, and text in a single document.
Researchers, data analysts, and even journalists are rapidly adopting this new medium. We …

Off-the-shelf deep learning is not enough, and requires parsimony, Bayesianity, and causality

RK Vasudevan, M Ziatdinov, L Vlcek… - npj Computational …, 2021 - nature.com
Deep neural networks ('deep learning') have emerged as a technology of choice to tackle
problems in speech recognition, computer vision, finance, etc. However, adoption of deep …

What's wrong with computational notebooks? Pain points, needs, and design opportunities

S Chattopadhyay, I Prasad, AZ Henley… - Proceedings of the …, 2020 - dl.acm.org
Computational notebooks, such as Azure, Databricks, and Jupyter, are a popular, interactive
paradigm for data scientists to author code, analyze data, and interleave visualizations, all …

The story in the notebook: Exploratory data science using a literate programming tool

MB Kery, M Radensky, M Arya, BE John… - Proceedings of the 2018 …, 2018 - dl.acm.org
Literate programming tools are used by millions of programmers today, and are intended to
facilitate presenting data analyses in the form of a narrative. We interviewed 21 data …

PyTerrier: Declarative experimentation in Python from BM25 to dense retrieval

C Macdonald, N Tonellotto, S MacAvaney… - Proceedings of the 30th …, 2021 - dl.acm.org
PyTerrier is a Python-based retrieval framework for expressing simple and complex
information retrieval (IR) pipelines in a declarative manner. While making use of the long …

Finding related tables in data lakes for interactive data science

Y Zhang, ZG Ives - Proceedings of the 2020 ACM SIGMOD International …, 2020 - dl.acm.org
Many modern data science applications build on data lakes, schema-agnostic repositories
of data files and data products that offer limited organization and management capabilities …

eXpression2Kinases (X2K) Web: linking expression signatures to upstream cell signaling networks

DJB Clarke, MV Kuleshov, BM Schilder… - Nucleic acids …, 2018 - academic.oup.com
While gene expression data at the mRNA level can be globally and accurately measured,
profiling the activity of cell signaling pathways is currently much more difficult …

Robotics Software: Past, Present, and Future

J Haviland, P Corke - Annual Review of Control, Robotics, and …, 2024 - annualreviews.org
Robotics is powered by software. Software tools control the rate of innovation in robotics
research, drive the growth of the robotics industry, and power the education of future …

CLIMB (the Cloud Infrastructure for Microbial Bioinformatics): an online resource for the medical microbiology community

TR Connor, NJ Loman, S Thompson… - Microbial …, 2016 - microbiologyresearch.org
The increasing availability and decreasing cost of high-throughput sequencing has
transformed academic medical microbiology, delivering an explosion in available genomes …

Towards scalable dataframe systems

D Petersohn, S Macke, D Xin, W Ma, D Lee… - arXiv preprint arXiv …, 2020 - arxiv.org
Dataframes are a popular abstraction to represent, prepare, and analyze data. Despite the
remarkable success of dataframe libraries in R and Python, dataframes face performance …