Review the state-of-the-art technologies of semantic segmentation based on deep learning

Y Mo, Y Wu, X Yang, F Liu, Y Liao - Neurocomputing, 2022 - Elsevier
The goal of semantic segmentation is to segment the input image according to semantic
information and predict the semantic category of each pixel from a given label set. With the …

[BOOK][B] Synthetic data for deep learning

SI Nikolenko - 2021 - Springer
You are holding in your hands… oh, come on, who holds books like this in their hands
anymore? Anyway, you are reading this, and it means that I have managed to release one of …

Going beyond nouns with vision & language models using synthetic data

P Cascante-Bonilla, K Shehada… - Proceedings of the …, 2023 - openaccess.thecvf.com
Large-scale pre-trained Vision & Language (VL) models have shown remarkable
performance in many applications, enabling replacing a fixed set of supported classes with …

Augmented reality meets computer vision: Efficient data generation for urban driving scenes

H Abu Alhaija, SK Mustikovela, L Mescheder… - International Journal of …, 2018 - Springer
The success of deep learning in computer vision is based on the availability of large
annotated datasets. To lower the need for hand labeled images, virtually rendered 3D …

Virtual KITTI 2

Y Cabon, N Murray, M Humenberger - arXiv preprint arXiv:2001.10773, 2020 - arxiv.org
This paper introduces an updated version of the well-known Virtual KITTI dataset which
consists of 5 sequence clones from the KITTI tracking benchmark. In addition, the dataset …

LCR-Net++: Multi-person 2D and 3D pose detection in natural images

G Rogez, P Weinzaepfel… - IEEE transactions on …, 2019 - ieeexplore.ieee.org
We propose an end-to-end architecture for joint 2D and 3D human pose estimation in
natural images. Key to our approach is the generation and scoring of a number of pose …

SceneNet RGB-D: Can 5M synthetic images beat generic ImageNet pre-training on indoor segmentation?

J McCormac, A Handa… - Proceedings of the …, 2017 - openaccess.thecvf.com
We introduce SceneNet RGB-D, a dataset providing pixel-perfect ground truth for
scene understanding problems such as semantic segmentation, instance segmentation, and …

Vision-based human action recognition: An overview and real world challenges

I Jegham, AB Khalifa, I Alouani, MA Mahjoub - Forensic Science …, 2020 - Elsevier
Within a large range of applications in computer vision, Human Action Recognition has
become one of the most attractive research fields. Ambiguities in recognizing actions does …