- Academic Search

Multimodal biomedical AI

JN Acosta, GJ Falcone, P Rajpurkar, EJ Topol - Nature Medicine, 2022 - nature.com

The increasing availability of biomedical data from large biobanks, electronic health records,
medical imaging, wearable and ambient biosensors, and the lower cost of genome and …

Cited by 676 · All 5 versions

[PDF] sciencedirect.com

A review of deep learning techniques for speech processing

A Mehrish, N Majumder, R Bharadwaj, R Mihalcea… - Information …, 2023 - Elsevier

The field of speech processing has undergone a transformative shift with the advent of deep
learning. The use of multiple processing layers has enabled the creation of models capable …

Cited by 227 · All 6 versions

[PDF] arxiv.org

DINOv2: Learning robust visual features without supervision

M Oquab, T Darcet, T Moutakanni, H Vo… - arXiv preprint arXiv …, 2023 - arxiv.org

The recent breakthroughs in natural language processing for model pretraining on large
quantities of data have opened the way for similar foundation models in computer vision …

Cited by 2211 · All 11 versions

[PDF] springer.com

A survey on deep learning tools dealing with data scarcity: definitions, challenges, solutions, tips, and applications

L Alzubaidi, J Bai, A Al-Sabaawi, J Santamaría… - Journal of Big Data, 2023 - Springer

Data scarcity is a major challenge when training deep learning (DL) models. DL demands a
large amount of data to achieve exceptional performance. Unfortunately, many applications …

Cited by 433 · All 9 versions

[PDF] thecvf.com

EVA: Exploring the limits of masked visual representation learning at scale

Y Fang, W Wang, B Xie, Q Sun, L Wu… - Proceedings of the …, 2023 - openaccess.thecvf.com

We launch EVA, a vision-centric foundation model to explore the limits of visual
representation at scale using only publicly accessible data. EVA is a vanilla ViT pre-trained …

Cited by 700 · All 5 versions

[PDF] thecvf.com

Self-supervised learning from images with a joint-embedding predictive architecture

M Assran, Q Duval, I Misra… - Proceedings of the …, 2023 - openaccess.thecvf.com

This paper demonstrates an approach for learning highly semantic image representations
without relying on hand-crafted data-augmentations. We introduce the Image-based Joint …

Cited by 336 · All 7 versions

[PDF] mlr.press

Make-an-audio: Text-to-audio generation with prompt-enhanced diffusion models

R Huang, J Huang, D Yang, Y Ren… - International …, 2023 - proceedings.mlr.press

Large-scale multimodal generative modeling has created milestones in text-to-image and
text-to-video generation. Its application to audio still lags behind for two main reasons: the …

Cited by 315 · All 7 versions

[PDF] nowpublishers.com

Multimodal foundation models: From specialists to general-purpose assistants

C Li, Z Gan, Z Yang, J Yang, L Li… - … and Trends® in …, 2024 - nowpublishers.com

Neural compression is the application of neural networks and other machine learning
methods to data compression. Recent advances in statistical machine learning have opened …

Cited by 214 · All 6 versions

[PDF] neurips.cc

Where are we in the search for an artificial visual cortex for embodied intelligence?

A Majumdar, K Yadav, S Arnaud, J Ma… - Advances in …, 2023 - proceedings.neurips.cc

We present the largest and most comprehensive empirical study of pre-trained visual
representations (PVRs) or visual 'foundation models' for Embodied AI. First, we curate …

Cited by 134 · All 6 versions

[PDF] arxiv.org

Masked autoencoders for point cloud self-supervised learning

Y Pang, W Wang, FEH Tay, W Liu, Y Tian… - European conference on …, 2022 - Springer

As a promising scheme of self-supervised learning, masked autoencoding has significantly
advanced natural language processing and computer vision. Inspired by this, we propose a …

Cited by 530 · All 6 versions


Data2vec: A general framework for self-supervised learning in speech, vision and language

Multimodal biomedical AI
