Understanding movies and their structural patterns is a crucial task in decoding the craft of video editing. While previous works have developed tools for general analysis, such as …
MK He, J Du, CH Lee - Proc. Interspeech, 2022 - drive.google.com
In this paper, we propose a novel end-to-end neural-network-based audio-visual speaker diarization method. Unlike most existing audio-visual methods, our audio-visual model takes …