Beyond neural scaling laws: beating power law scaling via data pruning

B Sorscher, R Geirhos, S Shekhar… - Advances in …, 2022 - proceedings.neurips.cc
Widely observed neural scaling laws, in which error falls off as a power of the training set
size, model size, or both, have driven substantial performance improvements in deep …

Kubric: A scalable dataset generator

K Greff, F Belletti, L Beyer, C Doersch… - Proceedings of the …, 2022 - openaccess.thecvf.com
Data is the driving force of machine learning, with the amount and quality of training data
often being more important for the performance of a system than architecture and training …

Generalized category discovery

S Vaze, K Han, A Vedaldi… - Proceedings of the …, 2022 - openaccess.thecvf.com
In this paper, we consider a highly general image recognition setting wherein, given a
labelled and unlabelled set of images, the task is to categorize all images in the unlabelled …

Towards large-scale 3D representation learning with multi-dataset point prompt training

X Wu, Z Tian, X Wen, B Peng, X Liu… - Proceedings of the …, 2024 - openaccess.thecvf.com
The rapid advancement of deep learning models is often attributed to their ability to leverage
massive training data. In contrast such privilege has not yet fully benefited 3D deep learning …

A study of face obfuscation in ImageNet

K Yang, JH Yau, L Fei-Fei, J Deng… - International …, 2022 - proceedings.mlr.press
Face obfuscation (blurring, mosaicing, etc.) has been shown to be effective for privacy
protection; nevertheless, object recognition research typically assumes access to complete …

Segmenting moving objects via an object-centric layered representation

J Xie, W Xie, A Zisserman - Advances in neural information …, 2022 - proceedings.neurips.cc
The objective of this paper is a model that is able to discover, track and segment multiple
moving objects in a video. We make four contributions: First, we introduce an object-centric …

FireRisk: A remote sensing dataset for fire risk assessment with benchmarks using supervised and self-supervised learning

S Shen, S Seneviratne, X Wanyan… - … Conference on Digital …, 2023 - ieeexplore.ieee.org
In recent decades, wildfires have caused tremendous property losses, fatalities, and
extensive damage to forest ecosystems. Inspired by the abundance of publicly available …

Data representativity for machine learning and AI systems

LH Clemmensen, RD Kjærsgaard - arXiv preprint arXiv:2203.04706, 2022 - arxiv.org
Data representativity is crucial when drawing inference from data through machine learning
models. Scholars have increased focus on unraveling the bias and fairness in models, also …

Self-supervised learning with kernel dependence maximization

Y Li, R Pogodin, DJ Sutherland… - Advances in Neural …, 2021 - proceedings.neurips.cc
We approach self-supervised learning of image representations from a statistical
dependence perspective, proposing Self-Supervised Learning with the Hilbert-Schmidt …

V-IRL: Grounding Virtual Intelligence in Real Life

J Yang, R Ding, E Brown, X Qi, S Xie - European Conference on Computer …, 2024 - Springer
There is a sensory gulf between the Earth that humans inhabit and the digital realms in
which modern AI agents are created. To develop AI agents that can sense, think, and act as …