Turnitin
降AI改写
早检测系统
早降重系统
Turnitin-UK版
万方检测-期刊版
维普编辑部版
Grammarly检测
Paperpass检测
checkpass检测
PaperYY检测
Neural target speech extraction: An overview
K Zmolikova, M Delcroix, T Ochiai… - IEEE Signal …, 2023 - ieeexplore.ieee.org
Humans can listen to a target speaker even in challenging acoustic conditions that have
noise, reverberation, and interfering speakers. This phenomenon is known as the cocktail …
noise, reverberation, and interfering speakers. This phenomenon is known as the cocktail …
Uniaudio: An audio foundation model toward universal audio generation
Large Language models (LLM) have demonstrated the capability to handle a variety of
generative tasks. This paper presents the UniAudio system, which, unlike prior task-specific …
generative tasks. This paper presents the UniAudio system, which, unlike prior task-specific …
Speechx: Neural codec language model as a versatile speech transformer
Recent advancements in generative speech models based on audio-text prompts have
enabled remarkable innovations like high-quality zero-shot text-to-speech. However …
enabled remarkable innovations like high-quality zero-shot text-to-speech. However …
Overview of speaker modeling and its applications: From the lens of deep speaker representation learning
Speaker individuality information is among the most critical elements within speech signals.
By thoroughly and accurately modeling this information, it can be utilized in various …
By thoroughly and accurately modeling this information, it can be utilized in various …
Uniaudio: Towards universal audio generation with large language models
Audio generation is a major branch of generative AI research. Compared with prior works in
this area that are commonly task-specific with heavy domain knowledge, this paper …
this area that are commonly task-specific with heavy domain knowledge, this paper …
Spex+: A complete time domain speaker extraction network
Speaker extraction aims to extract the target speech signal from a multi-talker environment
given a target speaker's reference speech. We recently proposed a time-domain solution …
given a target speaker's reference speech. We recently proposed a time-domain solution …
Deep learning-based non-intrusive multi-objective speech assessment model with cross-domain features
This study proposes a cross-domain multi-objective speech assessment model, called
MOSA-Net, which can simultaneously estimate the speech quality, intelligibility, and …
MOSA-Net, which can simultaneously estimate the speech quality, intelligibility, and …
Speech enhancement using self-adaptation and multi-head self-attention
This paper investigates a self-adaptation method for speech enhancement using auxiliary
speaker-aware features; we extract a speaker representation used for adaptation directly …
speaker-aware features; we extract a speaker representation used for adaptation directly …
Far-field automatic speech recognition
The machine recognition of speech spoken at a distance from the microphones, known as
far-field automatic speech recognition (ASR), has received a significant increase in attention …
far-field automatic speech recognition (ASR), has received a significant increase in attention …
Look once to hear: Target speech hearing with noisy examples
In crowded settings, the human brain can focus on speech from a target speaker, given prior
knowledge of how they sound. We introduce a novel intelligent hearable system that …
knowledge of how they sound. We introduce a novel intelligent hearable system that …