A survey of synthetic data augmentation methods in machine vision

A Mumuni, F Mumuni, NK Gerrar - Machine Intelligence Research, 2024 - Springer
The standard approach to tackling computer vision problems is to train deep convolutional
neural network (CNN) models using large-scale image datasets that are representative of …

AI-generated content (AIGC) for various data modalities: A survey

LG Foo, H Rahmani, J Liu - arXiv preprint arXiv:2308.14177, 2023 - arxiv.org
AI-generated content (AIGC) methods aim to produce text, images, videos, 3D assets, and
other media using AI algorithms. Due to its wide range of applications and the demonstrated …

Conditional image-to-video generation with latent flow diffusion models

H Ni, C Shi, K Li, SX Huang… - Proceedings of the IEEE …, 2023 - openaccess.thecvf.com
Conditional image-to-video (cI2V) generation aims to synthesize a new plausible video
starting from an image (e.g., a person's face) and a condition (e.g., an action class label like …

Long video generation with time-agnostic VQGAN and time-sensitive transformer

S Ge, T Hayes, H Yang, X Yin, G Pang… - … on Computer Vision, 2022 - Springer
Videos are created to express emotion, exchange information, and share experiences.
Video synthesis has intrigued researchers for a long time. Despite the rapid progress driven …

Deepfake video detection using convolutional vision transformer

D Wodajo, S Atnafu - arXiv preprint arXiv:2102.11126, 2021 - arxiv.org
The rapid advancement of deep learning models that can generate and synthesize
hyper-realistic videos known as Deepfakes and their ease of access to the general public have …

Generative adversarial networks for image and video synthesis: Algorithms and applications

MY Liu, X Huang, J Yu, TC Wang… - Proceedings of the …, 2021 - ieeexplore.ieee.org
The generative adversarial network (GAN) framework has emerged as a powerful tool for
various image and video synthesis tasks, allowing the synthesis of visual content in an …

DreamWalker: Mental planning for continuous vision-language navigation

H Wang, W Liang, L Van Gool… - Proceedings of the …, 2023 - openaccess.thecvf.com
VLN-CE is a recently released embodied task, where AI agents need to navigate a freely
traversable environment to reach a distant target location, given language instructions. It …

Enhancing photorealism enhancement

SR Richter, HA AlHaija, V Koltun - IEEE Transactions on …, 2022 - ieeexplore.ieee.org
We present an approach to enhancing the realism of synthetic images. The images are
enhanced by a convolutional network that leverages intermediate representations produced …

CityDreamer: Compositional generative model of unbounded 3D cities

H Xie, Z Chen, F Hong, Z Liu - Proceedings of the IEEE/CVF …, 2024 - openaccess.thecvf.com
3D city generation is a desirable yet challenging task since humans are more
sensitive to structural distortions in urban environments. Additionally, generating 3D cities is …

Taming visually guided sound generation

V Iashin, E Rahtu - arXiv preprint arXiv:2110.08791, 2021 - arxiv.org
Recent advances in visually-induced audio generation are based on sampling short, low-
fidelity, and one-class sounds. Moreover, sampling 1 second of audio from the state-of-the …