SweetTokenizer: Semantic-Aware Spatial-Temporal Tokenizer for Compact Visual Discretization

Z Tan, B Xue, J Jia, J Wang, W Ye, S Shi, M Sun… - arXiv preprint arXiv …, 2024 - arxiv.org
This paper presents the Semantic-aWarE spatial-tEmporal Tokenizer
(SweetTokenizer), a compact yet effective discretization approach …

STORM: A Spatio-Temporal Factor Model Based on Dual Vector Quantized Variational Autoencoders for Financial Trading

Y Zhao, W Zhang, T Yang, Y Jiang, F Huang… - arXiv preprint arXiv …, 2024 - arxiv.org
In financial trading, factor models are widely used to price assets and capture excess returns
from mispricing. Recently, we have witnessed the rise of variational autoencoder-based …

SoftVQ-VAE: Efficient 1-Dimensional Continuous Tokenizer

H Chen, Z Wang, X Li, X Sun, F Chen, J Liu… - arXiv preprint arXiv …, 2024 - arxiv.org
Efficient image tokenization with high compression ratios remains a critical challenge for
training generative models. We present SoftVQ-VAE, a continuous image tokenizer that …

Diverse Code Query Learning for Speech-Driven Facial Animation

C Gu, S Kuriyama, K Hotta - arXiv preprint arXiv:2409.19143, 2024 - arxiv.org
Speech-driven facial animation aims to synthesize lip-synchronized 3D talking faces
following the given speech signal. Prior methods to this task mostly focus on pursuing …

LG-VQ: Language-Guided Codebook Learning

L Guotao, B Zhang, Y Wang, Y Ye, X Li… - The Thirty-eighth Annual … - openreview.net
Vector quantization (VQ) is a key technique in high-resolution and high-fidelity image
synthesis, which aims to learn a codebook to encode an image with a sequence of discrete …