An empirical survey on long document summarization: Datasets, models, and metrics

HY Koh, J Ju, M Liu, S Pan - ACM computing surveys, 2022 - dl.acm.org
Long documents such as academic articles and business reports have been the standard
format to detail out important issues and complicated subjects that require extra attention. An …

A review of speaker diarization: Recent advances with deep learning

TJ Park, N Kanda, D Dimitriadis, KJ Han… - Computer Speech & …, 2022 - Elsevier
Speaker diarization is a task to label audio or video recordings with classes that correspond
to speaker identity, or in short, a task to identify “who spoke when”. In the early years …

QMSum: A new benchmark for query-based multi-domain meeting summarization

M Zhong, D Yin, T Yu, A Zaidi, M Mutuma, R Jha… - arXiv preprint arXiv …, 2021 - arxiv.org
Meetings are a key component of human collaboration. As increasing numbers of meetings
are recorded and transcribed, meeting summaries have become essential to remind those …

VoxCeleb: Large-scale speaker verification in the wild

A Nagrani, JS Chung, W Xie, A Zisserman - Computer Speech & Language, 2020 - Elsevier
The objective of this work is speaker recognition under noisy and unconstrained conditions.
We make two key contributions. First, we introduce a very large-scale audio-visual dataset …

Argument mining: A survey

J Lawrence, C Reed - Computational Linguistics, 2020 - direct.mit.edu
Argument mining is the automatic identification and extraction of the structure of inference
and reasoning expressed as arguments presented in natural language. Understanding …

CHiME-6 challenge: Tackling multispeaker speech recognition for unsegmented recordings

S Watanabe, M Mandel, J Barker, E Vincent… - arXiv preprint arXiv …, 2020 - arxiv.org
Following the success of the 1st, 2nd, 3rd, 4th and 5th CHiME challenges we organize the
6th CHiME Speech Separation and Recognition Challenge (CHiME-6). The new challenge …

LibriMix: An open-source dataset for generalizable speech separation

J Cosentino, M Pariente, S Cornell, A Deleforge… - arXiv preprint arXiv …, 2020 - arxiv.org
In recent years, wsj0-2mix has become the reference dataset for single-channel speech
separation. Most deep learning-based speech separation models today are benchmarked …

VoxCeleb: A large-scale speaker identification dataset

A Nagrani, JS Chung, A Zisserman - arXiv preprint arXiv:1706.08612, 2017 - arxiv.org
Most existing datasets for speaker identification contain samples obtained under quite
constrained conditions, and are usually hand-annotated, hence limited in size. The goal of …

SLURP: A spoken language understanding resource package

E Bastianelli, A Vanzo, P Swietojanski… - arXiv preprint arXiv …, 2020 - arxiv.org
Spoken Language Understanding infers semantic meaning directly from audio data, and
thus promises to reduce error propagation and misunderstandings in end-user applications …

DialogLM: Pre-trained model for long dialogue understanding and summarization

M Zhong, Y Liu, Y Xu, C Zhu, M Zeng - Proceedings of the AAAI …, 2022 - ojs.aaai.org
Dialogue is an essential part of human communication and cooperation. Existing research
mainly focuses on short dialogue scenarios in a one-on-one fashion. However, multi-person …