Google Scholar

Foundations & trends in multimodal machine learning: Principles, challenges, and open questions

PP Liang, A Zadeh, LP Morency - ACM Computing Surveys, 2024 - dl.acm.org

Multimodal machine learning is a vibrant multi-disciplinary research field that aims to design
computer agents with intelligent capabilities such as understanding, reasoning, and learning …

Cited by 75

A comprehensive review on generative AI for education

U Mittal, S Sai, V Chamola - IEEE Access, 2024 - ieeexplore.ieee.org

Artificial Intelligence (AI) has immense potential for personalized learning experiences,
content generation, and vivid educational support. This paper delves into generative AI (GAI) …

Cited by 17

Simple and controllable music generation

J Copet, F Kreuk, I Gat, T Remez… - Advances in …, 2024 - proceedings.neurips.cc

We tackle the task of conditional music generation. We introduce MusicGen, a single
Language Model (LM) that operates over several streams of compressed discrete music …

Cited by 452

Shap-E: Generating conditional 3D implicit functions

H Jun, A Nichol - arXiv preprint arXiv:2305.02463, 2023 - arxiv.org

We present Shap-E, a conditional generative model for 3D assets. Unlike recent work on 3D
generative models which produce a single output representation, Shap-E directly generates …

Cited by 396

Towards generalist biomedical AI

T Tu, S Azizi, D Driess, M Schaekermann, M Amin… - NEJM AI, 2024 - ai.nejm.org

Background: Medicine is inherently multimodal, requiring the simultaneous interpretation
and integration of insights between many data modalities spanning text, imaging, genomics …

Cited by 319

Photorealistic video generation with diffusion models

A Gupta, L Yu, K Sohn, X Gu, M Hahn, FF Li… - … on Computer Vision, 2024 - Springer

We present WALT, a diffusion transformer for photorealistic video generation from text
prompts. Our approach has two key design decisions. First, we use a causal encoder to …

Cited by 128

VideoPoet: A large language model for zero-shot video generation

D Kondratyuk, L Yu, X Gu, J Lezama, J Huang… - arXiv preprint arXiv …, 2023 - arxiv.org

We present VideoPoet, a language model capable of synthesizing high-quality video, with
matching audio, from a large variety of conditioning signals. VideoPoet employs a decoder …

Cited by 176

High-fidelity audio compression with improved RVQGAN

R Kumar, P Seetharaman, A Luebs… - Advances in Neural …, 2024 - proceedings.neurips.cc

Language models have been successfully used to model natural signals, such as
images, speech, and music. A key component of these models is a high quality neural …

Cited by 241

On the robustness of ChatGPT: An adversarial and out-of-distribution perspective

J Wang, X Hu, W Hou, H Chen, R Zheng… - arXiv preprint arXiv …, 2023 - arxiv.org

ChatGPT is a recent chatbot service released by OpenAI and is receiving increasing
attention over the past few months. While evaluations of various aspects of ChatGPT have …

Cited by 240

Generative AI

S Feuerriegel, J Hartmann, C Janiesch… - Business & Information …, 2024 - Springer

Tom Freston is credited with saying "Innovation is taking two things that exist and putting
them together in a new way". For a long time in history, it has been the prevailing …

Cited by 556
