Scientific discovery in the age of artificial intelligence
Artificial intelligence (AI) is being increasingly integrated into scientific discovery to augment
and accelerate research, helping scientists to generate hypotheses, design experiments …
Advances, challenges and opportunities in creating data for trustworthy AI
As artificial intelligence (AI) transitions from research to deployment, creating the appropriate
datasets and data pipelines to develop and evaluate AI models is increasingly the biggest …
Datacomp: In search of the next generation of multimodal datasets
Multimodal datasets are a critical component in recent breakthroughs such as CLIP, Stable
Diffusion and GPT-4, yet their design does not receive the same research attention as model …
Ask me anything: A simple strategy for prompting language models
Large language models (LLMs) transfer well to new tasks out-of-the-box simply given a
natural language prompt that demonstrates how to perform the task and no additional …
Pervasive label errors in test sets destabilize machine learning benchmarks
We identify label errors in the test sets of 10 of the most commonly-used computer vision,
natural language, and audio datasets, and subsequently study the potential for these label …
Data-centric artificial intelligence: A survey
Artificial Intelligence (AI) is making a profound impact in almost every domain. A vital enabler
of its great success is the availability of abundant and high-quality data for building machine …
Confident learning: Estimating uncertainty in dataset labels
Learning exists in the context of data, yet notions of confidence typically focus on model
predictions, not label quality. Confident learning (CL) is an alternative approach which …
Deep learning with noisy labels: Exploring techniques and remedies in medical image analysis
Supervised training of deep learning models requires large labeled datasets. There is a
growing interest in obtaining such datasets for medical image analysis applications …
A state-of-the-art survey on deep learning theory and architectures
In recent years, deep learning has garnered tremendous success in a variety of application
domains. This new field of machine learning has been growing rapidly and has been …
A survey on data collection for machine learning: a big data-AI integration perspective
Data collection is a major bottleneck in machine learning and an active research topic in
multiple communities. There are largely two reasons data collection has recently become a …