Large language models in law: A survey

J Lai, W Gan, J Wu, Z Qi, PS Yu - AI Open, 2024 - Elsevier
The advent of artificial intelligence (AI) has significantly impacted the traditional judicial
industry. Moreover, recently, with the development of the concept of AI-generated content …

A review on the attention mechanism of deep learning

Z Niu, G Zhong, H Yu - Neurocomputing, 2021 - Elsevier
Attention has arguably become one of the most important concepts in the deep learning
field. It is inspired by the biological systems of humans that tend to focus on the distinctive …

Generating diverse and natural 3D human motions from text

C Guo, S Zou, X Zuo, S Wang, W Ji… - Proceedings of the …, 2022 - openaccess.thecvf.com
Automated generation of 3D human motions from text is a challenging problem. The
generated motions are expected to be sufficiently diverse to explore the text-grounded …

TM2T: Stochastic and tokenized modeling for the reciprocal generation of 3D human motions and texts

C Guo, X Zuo, S Wang, L Cheng - European Conference on Computer …, 2022 - Springer
Inspired by the strong ties between vision and language, the two intimate human sensing
and communication modalities, our paper aims to explore the generation of 3D human full …

NExT-QA: Next phase of question-answering to explaining temporal actions

J Xiao, X Shang, A Yao… - Proceedings of the IEEE …, 2021 - openaccess.thecvf.com
We introduce NExT-QA, a rigorously designed video question answering (VideoQA)
benchmark to advance video understanding from describing to explaining the temporal …

Video ReCap: Recursive captioning of hour-long videos

MM Islam, N Ho, X Yang, T Nagarajan… - Proceedings of the …, 2024 - openaccess.thecvf.com
Most video captioning models are designed to process short video clips of few seconds and
output text describing low-level visual concepts (e.g., objects, scenes, atomic actions). However …

AI Choreographer: Music conditioned 3D dance generation with AIST++

R Li, S Yang, DA Ross… - Proceedings of the IEEE …, 2021 - openaccess.thecvf.com
We present AIST++, a new multi-modal dataset of 3D dance motion and music, along with
FACT, a Full-Attention Cross-modal Transformer network for generating 3D dance motion …

Review and perspectives on driver digital twin and its enabling technologies for intelligent vehicles

Z Hu, S Lou, Y Xing, X Wang, D Cao… - IEEE Transactions on …, 2022 - ieeexplore.ieee.org
Digital Twin (DT) is an emerging technology and has been introduced into intelligent driving
and transportation systems to digitize and synergize connected automated vehicles …

AutoAD: Movie description in context

T Han, M Bain, A Nagrani, G Varol… - Proceedings of the …, 2023 - openaccess.thecvf.com
The objective of this paper is an automatic Audio Description (AD) model that ingests movies
and outputs AD in text form. Generating high-quality movie AD is challenging due to the …

Dolphins: Multimodal language model for driving

Y Ma, Y Cao, J Sun, M Pavone, C Xiao - European Conference on …, 2024 - Springer
The quest for fully autonomous vehicles (AVs) capable of navigating complex real-world
scenarios with human-like understanding and responsiveness. In this paper, we introduce …