Socratic models: Composing zero-shot multimodal reasoning with language
Large pretrained (e.g., "foundation") models exhibit distinct capabilities depending on the
domain of data they are trained on. While these domains are generic, they may only barely …
Deep reinforcement learning at the edge of the statistical precipice
Deep reinforcement learning (RL) algorithms are predominantly evaluated by comparing
their relative performance on a large suite of tasks. Most published results on deep RL …
CLIPort: What and where pathways for robotic manipulation
How can we imbue robots with the ability to manipulate objects precisely but also to reason
about them in terms of abstract concepts? Recent works in manipulation have shown that …
Toward causal representation learning
The two fields of machine learning and graphical causality arose and are developed
separately. However, there is, now, cross-pollination and increasing interest in both fields to …
Object-centric learning with slot attention
Learning object-centric representations of complex scenes is a promising step towards
enabling efficient abstract reasoning from low-level perceptual features. Yet, most deep …
Transporter networks: Rearranging the visual world for robotic manipulation
Robotic manipulation can be formulated as inducing a sequence of spatial displacements:
where the space being moved can encompass an object, part of an object, or end effector. In …
On the binding problem in artificial neural networks
Contemporary neural networks still fall short of human-level generalization, which extends
far beyond our direct experiences. In this paper, we argue that the underlying cause for this …
PiCIE: Unsupervised semantic segmentation using invariance and equivariance in clustering
We present a new framework for semantic segmentation without annotations via clustering.
Off-the-shelf clustering methods are limited to curated, single-label, and object-centric …
DIGIT: A novel design for a low-cost compact high-resolution tactile sensor with application to in-hand manipulation
Despite decades of research, general purpose in-hand manipulation remains one of the
unsolved challenges of robotics. One of the contributing factors that limit current robotic …
Human-to-robot imitation in the wild
We approach the problem of learning by watching humans in the wild. While traditional
approaches in Imitation and Reinforcement Learning are promising for learning in the real …