Data augmentation: A comprehensive survey of modern approaches

A Mumuni, F Mumuni - Array, 2022 - Elsevier
To ensure good performance, modern machine learning models typically require large
amounts of quality annotated data. Meanwhile, the data collection and annotation processes …

A survey of controllable text generation using transformer-based pre-trained language models

H Zhang, H Song, S Li, M Zhou, D Song - ACM Computing Surveys, 2023 - dl.acm.org
Controllable Text Generation (CTG) is an emerging area in the field of natural language
generation (NLG). It is regarded as crucial for the development of advanced text generation …

EDGE: Editable dance generation from music

J Tseng, R Castellon, K Liu - Proceedings of the IEEE/CVF …, 2023 - openaccess.thecvf.com
Dance is an important human art form, but creating new dances can be difficult and time-
consuming. In this work, we introduce Editable Dance GEneration (EDGE), a state-of-the-art …

Learning fine-grained bimanual manipulation with low-cost hardware

TZ Zhao, V Kumar, S Levine, C Finn - arXiv preprint arXiv:2304.13705, 2023 - arxiv.org
Fine manipulation tasks, such as threading cable ties or slotting a battery, are notoriously
difficult for robots because they require precision, careful coordination of contact forces, and …

A survey on trajectory-prediction methods for autonomous driving

Y Huang, J Du, Z Yang, Z Zhou… - IEEE Transactions on …, 2022 - ieeexplore.ieee.org
In order to drive safely in a dynamic environment, autonomous vehicles should be able to
predict the future states of traffic participants nearby, especially surrounding vehicles, similar …

High-resolution image synthesis with latent diffusion models

R Rombach, A Blattmann, D Lorenz… - Proceedings of the …, 2022 - openaccess.thecvf.com
By decomposing the image formation process into a sequential application of denoising
autoencoders, diffusion models (DMs) achieve state-of-the-art synthesis results on image …

Repurposing diffusion-based image generators for monocular depth estimation

B Ke, A Obukhov, S Huang, N Metzger… - Proceedings of the …, 2024 - openaccess.thecvf.com
Monocular depth estimation is a fundamental computer vision task. Recovering 3D depth
from a single image is geometrically ill-posed and requires scene understanding so it is not …

Ambiguous medical image segmentation using diffusion models

A Rahman, JMJ Valanarasu… - Proceedings of the …, 2023 - openaccess.thecvf.com
Collective insights from a group of experts have always proven to outperform an individual's
best diagnostic for clinical tasks. For the task of medical image segmentation, existing …

MotionCLIP: Exposing human motion generation to CLIP space

G Tevet, B Gordon, A Hertz, AH Bermano… - … on Computer Vision, 2022 - Springer
We introduce MotionCLIP, a 3D human motion auto-encoder featuring a latent embedding
that is disentangled, well behaved, and supports highly semantic textual descriptions …

Diffusion policies as an expressive policy class for offline reinforcement learning

Z Wang, JJ Hunt, M Zhou - arXiv preprint arXiv:2208.06193, 2022 - arxiv.org
Offline reinforcement learning (RL), which aims to learn an optimal policy using a previously
collected static dataset, is an important paradigm of RL. Standard RL methods often perform …