Google Scholar

Large language models in law: A survey

J Lai, W Gan, J Wu, Z Qi, PS Yu - AI Open, 2024 - Elsevier

The advent of artificial intelligence (AI) has significantly impacted the traditional judicial
industry. Moreover, recently, with the development of the concept of AI-generated content …

Cited by 67; 4 versions

A review on the attention mechanism of deep learning

Z Niu, G Zhong, H Yu - Neurocomputing, 2021 - Elsevier

Attention has arguably become one of the most important concepts in the deep learning
field. It is inspired by the biological systems of humans that tend to focus on the distinctive …

Cited by 2455; 4 versions

[PDF] thecvf.com

Generating diverse and natural 3D human motions from text

C Guo, S Zou, X Zuo, S Wang, W Ji… - Proceedings of the …, 2022 - openaccess.thecvf.com

Automated generation of 3D human motions from text is a challenging problem. The
generated motions are expected to be sufficiently diverse to explore the text-grounded …

Cited by 520; 8 versions

[PDF] arxiv.org

TM2T: Stochastic and tokenized modeling for the reciprocal generation of 3D human motions and texts

C Guo, X Zuo, S Wang, L Cheng - European Conference on Computer …, 2022 - Springer

Inspired by the strong ties between vision and language, the two intimate human sensing
and communication modalities, our paper aims to explore the generation of 3D human full …

Cited by 212; 10 versions

[PDF] thecvf.com

NExT-QA: Next phase of question-answering to explaining temporal actions

J Xiao, X Shang, A Yao… - Proceedings of the IEEE …, 2021 - openaccess.thecvf.com

We introduce NExT-QA, a rigorously designed video question answering (VideoQA)
benchmark to advance video understanding from describing to explaining the temporal …

Cited by 382; 6 versions

[PDF] thecvf.com

Video ReCap: Recursive captioning of hour-long videos

MM Islam, N Ho, X Yang, T Nagarajan… - Proceedings of the …, 2024 - openaccess.thecvf.com

Most video captioning models are designed to process short video clips of few seconds and
output text describing low-level visual concepts (e.g., objects, scenes, atomic actions). However …

Cited by 31; 7 versions

[PDF] thecvf.com

AI Choreographer: Music conditioned 3D dance generation with AIST++

R Li, S Yang, DA Ross… - Proceedings of the IEEE …, 2021 - openaccess.thecvf.com

We present AIST++, a new multi-modal dataset of 3D dance motion and music, along with
FACT, a Full-Attention Cross-modal Transformer network for generating 3D dance motion …

Cited by 513; 7 versions

[PDF] cranfield.ac.uk

Review and perspectives on driver digital twin and its enabling technologies for intelligent vehicles

Z Hu, S Lou, Y Xing, X Wang, D Cao… - IEEE Transactions on …, 2022 - ieeexplore.ieee.org

Digital Twin (DT) is an emerging technology and has been introduced into intelligent driving
and transportation systems to digitize and synergize connected automated vehicles …

Cited by 145; 5 versions

[PDF] thecvf.com

AutoAD: Movie description in context

T Han, M Bain, A Nagrani, G Varol… - Proceedings of the …, 2023 - openaccess.thecvf.com

The objective of this paper is an automatic Audio Description (AD) model that ingests movies
and outputs AD in text form. Generating high-quality movie AD is challenging due to the …

Cited by 61; 7 versions

[PDF] arxiv.org

Dolphins: Multimodal language model for driving

Y Ma, Y Cao, J Sun, M Pavone, C Xiao - European Conference on …, 2024 - Springer

The quest for fully autonomous vehicles (AVs) capable of navigating complex real-world
scenarios with human-like understanding and responsiveness. In this paper, we introduce …

Cited by 48; 7 versions

Sequence to sequence-video to text

[HTML] Large language models in law: A survey
