WavChat: A Survey of Spoken Dialogue Models
Recent advancements in spoken dialogue models, exemplified by systems like GPT-4o,
have captured significant attention in the speech domain. Compared to traditional three-tier …
Recent Advances in Discrete Speech Tokens: A Review
The rapid advancement of speech generation technologies in the era of large language
models (LLMs) has established discrete speech tokens as a foundational paradigm for …
Neural Codec Source Tracing: Toward Comprehensive Attribution in Open-Set Condition
Current research in audio deepfake detection is gradually transitioning from binary
classification to multi-class tasks, referred to as the audio deepfake source tracing task. However …
FocalCodec: Low-Bitrate Speech Coding via Focal Modulation Networks
Large language models have revolutionized natural language processing through self-
supervised pretraining on massive datasets. Inspired by this success, researchers have …
The ICME 2025 Audio Encoder Capability Challenge
J Zhang, H Dinkel, Q Song, H Wang, Y Niu… - arXiv preprint arXiv …, 2025 - arxiv.org
This challenge aims to evaluate the capabilities of audio encoders, especially in the context
of multi-task learning and real-world applications. Participants are invited to submit pre …
LUCY: Linguistic Understanding and Control Yielding Early Stage of Her
The film Her features Samantha, a sophisticated AI audio agent who is capable of
understanding both linguistic and paralinguistic information in human speech and delivering …
CodecFake-Omni: A Large-Scale Codec-based Deepfake Speech Dataset
With the rapid advancement of codec-based speech generation (CoSG) systems, creating
fake speech that mimics an individual's identity and spreads misinformation has become …
Artificial Intelligence in Creative Industries: Advances Prior to 2025
The rapid advancements in artificial intelligence (AI), particularly in generative AI and large
language models (LLMs), have profoundly impacted the creative industries by enabling …
MARS6: A Small and Robust Hierarchical-Codec Text-to-Speech Model
M Baas, P Scholtz, A Mehta, E Dyson… - arXiv preprint arXiv …, 2025 - arxiv.org
Codec-based text-to-speech (TTS) models have shown impressive quality with zero-shot
voice cloning abilities. However, they often struggle with more expressive references or …
DAC-JAX: A JAX Implementation of the Descript Audio Codec
D Braun - arXiv preprint arXiv:2405.11554, 2024 - arxiv.org
We present an open-source implementation of the Descript Audio Codec (DAC) using
Google's JAX ecosystem of Flax, Optax, Orbax, AUX, and CLU. Our codebase enables the …