Advances, challenges and opportunities in creating data for trustworthy AI
As artificial intelligence (AI) transitions from research to deployment, creating the appropriate
datasets and data pipelines to develop and evaluate AI models is increasingly the biggest …
Machine learning for high-entropy alloys: Progress, challenges and opportunities
High-entropy alloys (HEAs) have attracted extensive interest due to their exceptional
mechanical properties and the vast compositional space for new HEAs. However …
Segment anything
We introduce the Segment Anything (SA) project: a new task, model, and dataset for
image segmentation. Using our efficient model in a data collection loop, we built the largest …
DataComp: In search of the next generation of multimodal datasets
Multimodal datasets are a critical component in recent breakthroughs such as CLIP, Stable
Diffusion and GPT-4, yet their design does not receive the same research attention as model …
Beyond neural scaling laws: beating power law scaling via data pruning
Widely observed neural scaling laws, in which error falls off as a power of the training set
size, model size, or both, have driven substantial performance improvements in deep …
Intriguing properties of vision transformers
Vision transformers (ViT) have demonstrated impressive performance across numerous
machine vision tasks. These models are based on multi-head self-attention mechanisms that …
The Atlas of AI: Power, Politics, and the Planetary Costs of Artificial Intelligence
K Crawford - 2021 - static10.labirint.ru
The hidden costs of artificial intelligence, from natural resources and labor to privacy and
freedom. What happens when artificial intelligence saturates political life and depletes the …
Multimodal datasets: misogyny, pornography, and malignant stereotypes
We have now entered the era of trillion parameter machine learning models trained on
billion-sized datasets scraped from the internet. The rise of these gargantuan datasets has …
Kubric: A scalable dataset generator
Data is the driving force of machine learning, with the amount and quality of training data
often being more important for the performance of a system than architecture and training …
SWAD: Domain generalization by seeking flat minima
Domain generalization (DG) methods aim to achieve generalizability to an unseen
target domain by using only training data from the source domains. Although a variety of DG …