Data and its (dis)contents: A survey of dataset development and use in machine learning research

A Paullada, ID Raji, EM Bender, E Denton, A Hanna - Patterns, 2021 - cell.com
In this work, we survey a breadth of literature that has revealed the limitations of
predominant practices for dataset collection and use in the field of machine learning. We …

Video description: A survey of methods, datasets, and evaluation metrics

N Aafaq, A Mian, W Liu, SZ Gilani, M Shah - ACM Computing Surveys …, 2019 - dl.acm.org
Video description is the automatic generation of natural language sentences that describe
the contents of a given video. It has applications in human-robot interaction, helping the …

End-to-end learning of visual representations from uncurated instructional videos

A Miech, JB Alayrac, L Smaira… - Proceedings of the …, 2020 - openaccess.thecvf.com
Annotating videos is cumbersome, expensive and not scalable. Yet, many strong video
models still rely on manually annotated data. With the recent introduction of the HowTo100M …

Deep visual-semantic alignments for generating image descriptions

A Karpathy, L Fei-Fei - Proceedings of the IEEE conference on …, 2015 - cv-foundation.org
We present a model that generates natural language descriptions of images and their
regions. Our approach leverages datasets of images and their sentence descriptions to …

Labeled faces in the wild: A survey

E Learned-Miller, GB Huang, A RoyChowdhury… - Advances in face …, 2016 - Springer
In 2007, Labeled Faces in the Wild was released in an effort to spur research in
face recognition, specifically for the problem of face verification with unconstrained images …

Cross-pose LFW: A database for studying cross-pose face recognition in unconstrained environments

T Zheng, W Deng - Beijing University of Posts and Telecommunications …, 2018 - whdeng.cn
Labeled Faces in the Wild (LFW) database has been widely utilized as the
benchmark of unconstrained face verification. Recently, due to big data driven machine …

Learning to separate object sounds by watching unlabeled video

R Gao, R Feris, K Grauman - Proceedings of the European …, 2018 - openaccess.thecvf.com
Perceiving a scene most fully requires all the senses. Yet modeling how objects look and
sound is challenging: most natural scenes and events contain multiple objects, and the …

A data-driven approach to cleaning large face datasets

HW Ng, S Winkler - 2014 IEEE international conference on …, 2014 - ieeexplore.ieee.org
Large face datasets are important for advancing face recognition research, but they are
tedious to build, because a lot of work has to go into cleaning the huge amount of raw data …