Beyond neural scaling laws: beating power law scaling via data pruning
Widely observed neural scaling laws, in which error falls off as a power of the training set
size, model size, or both, have driven substantial performance improvements in deep …
Kubric: A scalable dataset generator
Data is the driving force of machine learning, with the amount and quality of training data
often being more important for the performance of a system than architecture and training …
Generalized category discovery
In this paper, we consider a highly general image recognition setting wherein, given a
labelled and unlabelled set of images, the task is to categorize all images in the unlabelled …
Towards large-scale 3D representation learning with multi-dataset point prompt training
The rapid advancement of deep learning models is often attributed to their ability to leverage
massive training data. In contrast, such privilege has not yet fully benefited 3D deep learning …
A study of face obfuscation in ImageNet
Face obfuscation (blurring, mosaicing, etc.) has been shown to be effective for privacy
protection; nevertheless, object recognition research typically assumes access to complete …
Segmenting moving objects via an object-centric layered representation
The objective of this paper is a model that is able to discover, track and segment multiple
moving objects in a video. We make four contributions: First, we introduce an object-centric …
FireRisk: A remote sensing dataset for fire risk assessment with benchmarks using supervised and self-supervised learning
In recent decades, wildfires have caused tremendous property losses, fatalities, and
extensive damage to forest ecosystems. Inspired by the abundance of publicly available …
Data representativity for machine learning and AI systems
LH Clemmensen, RD Kjærsgaard - arXiv preprint arXiv:2203.04706, 2022 - arxiv.org
Data representativity is crucial when drawing inference from data through machine learning
models. Scholars have increased focus on unraveling the bias and fairness in models, also …
Self-supervised learning with kernel dependence maximization
We approach self-supervised learning of image representations from a statistical
dependence perspective, proposing Self-Supervised Learning with the Hilbert-Schmidt …
V-IRL: Grounding Virtual Intelligence in Real Life
There is a sensory gulf between the Earth that humans inhabit and the digital realms in
which modern AI agents are created. To develop AI agents that can sense, think, and act as …