- Academic Search

A survey on video diffusion models

Z Xing, Q Feng, H Chen, Q Dai, H Hu, H Xu… - ACM Computing …, 2024 - dl.acm.org

The recent wave of AI-generated content (AIGC) has witnessed substantial success in
computer vision, with the diffusion model playing a crucial role in this achievement. Due to …

Cited by: 92

[PDF] arxiv.org

Evaluating text-to-visual generation with image-to-text generation

Z Lin, D Pathak, B Li, J Li, X Xia, G Neubig… - … on Computer Vision, 2024 - Springer

Despite significant progress in generative AI, comprehensive evaluation remains
challenging because of the lack of effective metrics and standardized benchmarks. For …

Cited by: 61

[PDF] frontiersin.org

Deepfake: definitions, performance metrics and standards, datasets, and a meta-review

E Altuncu, VNL Franqueira, S Li - Frontiers in Big Data, 2024 - frontiersin.org

Recent advancements in AI, especially deep learning, have contributed to a significant
increase in the creation of new realistic-looking synthetic media (video, image, and audio) …

Cited by: 4

[PDF] arxiv.org

VideoScore: Building automatic metrics to simulate fine-grained human feedback for video generation

X He, D Jiang, G Zhang, M Ku, A Soni, S Siu… - arXiv preprint arXiv …, 2024 - arxiv.org

The recent years have witnessed great advances in video generation. However, the
development of automatic video metrics is lagging significantly behind. None of the existing …

Cited by: 23

[PDF] arxiv.org

ChronoMagic-Bench: A benchmark for metamorphic evaluation of text-to-time-lapse video generation

S Yuan, J Huang, Y Xu, Y Liu, S Zhang, Y Shi… - arXiv preprint arXiv …, 2024 - arxiv.org

We propose a novel text-to-video (T2V) generation benchmark, ChronoMagic-Bench, to
evaluate the temporal and metamorphic capabilities of the T2V models (e.g., Sora and …

Cited by: 17

[PDF] thecvf.com

Evaluating and Improving Compositional Text-to-Visual Generation

B Li, Z Lin, D Pathak, J Li, Y Fei, K Wu… - Proceedings of the …, 2024 - openaccess.thecvf.com

While text-to-visual models now produce photo-realistic images and videos, they struggle
with compositional text prompts involving attributes, relationships, and higher-order …

Cited by: 9

[PDF] arxiv.org

Subjective-aligned dataset and metric for text-to-video quality assessment

T Kou, X Liu, Z Zhang, C Li, H Wu, X Min… - Proceedings of the …, 2024 - dl.acm.org

With the rapid development of generative models, AI-Generated Content (AIGC) has
exponentially increased in daily lives. Among them, Text-to-Video (T2V) generation has …

Cited by: 8

[PDF] arxiv.org

Is Sora a world simulator? A comprehensive survey on general world models and beyond

Z Zhu, X Wang, W Zhao, C Min, N Deng, M Dou… - arXiv preprint arXiv …, 2024 - arxiv.org

General world models represent a crucial pathway toward achieving Artificial General
Intelligence (AGI), serving as the cornerstone for various applications ranging from virtual …

Cited by: 33

[PDF] arxiv.org

Improving dynamic object interactions in text-to-video generation with AI feedback

H Furuta, H Zen, D Schuurmans, A Faust… - arXiv preprint arXiv …, 2024 - arxiv.org

Large text-to-video models hold immense potential for a wide range of downstream
applications. However, these models struggle to accurately depict dynamic object …

Cited by: 2

[PDF] arxiv.org

Next Token Prediction Towards Multimodal Intelligence: A Comprehensive Survey

L Chen, Z Wang, S Ren, L Li, H Zhao, Y Li… - arXiv preprint arXiv …, 2024 - arxiv.org

Building on the foundations of language modeling in natural language processing, Next
Token Prediction (NTP) has evolved into a versatile training objective for machine learning …

Cited by: 2
