- Academic Search

An empirical survey on long document summarization: Datasets, models, and metrics

HY Koh, J Ju, M Liu, S Pan - ACM computing surveys, 2022 - dl.acm.org

Long documents such as academic articles and business reports have been the standard
format to detail out important issues and complicated subjects that require extra attention. An …

Cited by 123 · 7 versions

VoiceCraft: Zero-shot speech editing and text-to-speech in the wild

P Peng, PY Huang, SW Li, A Mohamed… - arxiv preprint arxiv …, 2024 - arxiv.org

We introduce VoiceCraft, a token infilling neural codec language model, that achieves state-
of-the-art performance on both speech editing and zero-shot text-to-speech (TTS) on …

Cited by 48 · 2 versions

SpiRit-LM: Interleaved Spoken and Written Language Model

TA Nguyen, B Muller, B Yu, MR Costa-Jussa… - Transactions of the …, 2025 - direct.mit.edu

We introduce SpiRit-lm, a foundation multimodal language model that freely mixes text and
speech. Our model is based on a 7B pretrained text language model that we extend to the …

Cited by 27 · 4 versions

SummScreen: A dataset for abstractive screenplay summarization

M Chen, Z Chu, S Wiseman, K Gimpel - arxiv preprint arxiv:2104.07091, 2021 - arxiv.org

We introduce SummScreen, a summarization dataset comprised of pairs of TV series
transcripts and human written recaps. The dataset provides a challenging testbed for …

Cited by 127 · 6 versions

Expresso: A benchmark and analysis of discrete expressive speech resynthesis

TA Nguyen, WN Hsu, A d'Avirro, B Shi, I Gat… - arxiv preprint arxiv …, 2023 - arxiv.org

Recent work has shown that it is possible to resynthesize high-quality speech based, not on
text, but on low bitrate discrete units that have been learned in a self-supervised fashion and …

Cited by 47 · 9 versions

Building real-world meeting summarization systems using large language models: A practical perspective

MTR Laskar, XY Fu, C Chen, SB Tn - arxiv preprint arxiv:2310.19233, 2023 - arxiv.org

This paper studies how to effectively build meeting summarization systems for real-world
usage using large language models (LLMs). For this purpose, we conduct an extensive …

Cited by 31 · 2 versions

Speech-Text Pre-training for Spoken Dialog Understanding with Explicit Cross-Modal Alignment

T Yu, H Gao, TE Lin, M Yang, Y Wu, W Ma… - Proceedings of the …, 2023 - aclanthology.org

Recently, speech-text pre-training methods have shown remarkable success in many
speech and natural language processing tasks. However, most previous pre-trained models …

Cited by 17 · 2 versions

MeetingBank: A benchmark dataset for meeting summarization

Y Hu, T Ganter, H Deilamsalehy, F Dernoncourt… - arxiv preprint arxiv …, 2023 - arxiv.org

As the number of recorded meetings increases, it becomes increasingly important to utilize
summarization technology to create useful summaries of these recordings. However, there is …

Cited by 36 · 6 versions

Long-span summarization via local attention and content selection

P Manakul, MJF Gales - arxiv preprint arxiv:2105.03801, 2021 - arxiv.org

Transformer-based models have achieved state-of-the-art results in a wide range of natural
language processing (NLP) tasks including document summarization. Typically these …

Cited by 57 · 6 versions

How might we create better benchmarks for speech recognition?

A Aksënova, D van Esch, J Flynn… - Proceedings of the 1st …, 2021 - aclanthology.org

The applications of automatic speech recognition (ASR) systems are proliferating, in part
due to recent significant quality improvements. However, as recent work indicates, even …

Cited by 46 · 9 versions

100,000 podcasts: A spoken English document corpus

An empirical survey on long document summarization: Datasets, models, and metrics
