GaussCtrl: Multi-view consistent text-driven 3D Gaussian Splatting editing
We propose GaussCtrl, a text-driven method to edit a 3D scene reconstructed by the 3D
Gaussian Splatting (3DGS). Our method first renders a collection of images by using the …
Trailblazer: Trajectory control for diffusion-based video generation
Large text-to-video (T2V) models such as Sora have the potential to revolutionize visual
effects and the creation of some types of movies. Current T2V models require tedious trial …
BAMM: Bidirectional autoregressive motion model
Generating human motion from text has been dominated by denoising motion models either
through diffusion or generative masking process. However, these models face great …
Mismatch quest: Visual and textual feedback for image-text misalignment
While existing image-text alignment models reach high quality binary assessments, they fall
short of pinpointing the exact source of misalignment. In this paper, we present a method to …
Action2Sound: Ambient-aware generation of action sounds from egocentric videos
Generating realistic audio for human actions is important for many applications, such as
creating sound effects for films or virtual reality games. Existing approaches implicitly …
Towards building specialized generalist AI with System 1 and System 2 fusion
In this perspective paper, we introduce the concept of Specialized Generalist Artificial
Intelligence (SGAI or simply SGI) as a crucial milestone toward Artificial General Intelligence …
Creativity in AI: Progresses and Challenges
Creativity is the ability to produce novel, useful, and surprising ideas, and has been widely
studied as a crucial aspect of human cognition. Machine creativity on the other hand has …
PS-StyleGAN: Illustrative Portrait Sketching Using Attention-Based Style Adaptation
Portrait sketching involves capturing identity specific attributes of a real face with abstract
lines and shades. Unlike photo-realistic images, a good portrait sketch generation method …
Latent Diffusion for Guided Document Table Generation
Obtaining annotated table structure data for complex tables is a challenging task due to the
inherent diversity and complexity of real-world document layouts. The scarcity of publicly …
Position: Levels of AGI for Operationalizing Progress on the Path to AGI
We propose a framework for classifying the capabilities and behavior of Artificial General
Intelligence (AGI) models and their precursors. This framework introduces levels of AGI …