NeRF Analogies: Example-Based Visual Attribute Transfer for NeRFs

M Fischer, Z Li, T Nguyen-Phuoc… - Proceedings of the …, 2024 - openaccess.thecvf.com
A Neural Radiance Field (NeRF) encodes the specific relation of 3D geometry and
appearance of a scene. We here ask the question whether we can transfer the appearance …

Follow Anything: Open-Set Detection, Tracking, and Following in Real-Time

A Maalouf, N Jadhav, KM Jatavallabhula… - IEEE Robotics and …, 2024 - ieeexplore.ieee.org
Tracking and following objects of interest is critical to several robotics use cases, ranging
from industrial automation to logistics and warehousing, to healthcare and security. In this …

Moho: Learning single-view hand-held object reconstruction with multi-view occlusion-aware supervision

C Zhang, G Jiao, Y Di, G Wang… - Proceedings of the …, 2024 - openaccess.thecvf.com
Previous works concerning single-view hand-held object reconstruction typically rely on
supervision from 3D ground-truth models which are hard to collect in real world. In contrast …

PartCraft: Crafting Creative Objects by Parts

KW Ng, X Zhu, YZ Song, T Xiang - European Conference on Computer …, 2024 - Springer
This paper propels creative control in generative visual AI by allowing users to “select”.
Departing from traditional text or sketch-based methods, we for the first time allow users to …

Unbiased single-cell morphology with self-supervised vision transformers

M Doron, T Moutakanni, ZS Chen, N Moshkov… - bioRxiv, 2023 - ncbi.nlm.nih.gov
Accurately quantifying cellular morphology at scale could substantially empower existing
single-cell approaches. However, measuring cell morphology remains an active field of …

MS-DINO: Masked self-supervised distributed learning using vision transformer

S Park, IJ Lee, JW Kim, JC Ye - IEEE Journal of Biomedical and …, 2024 - ieeexplore.ieee.org
Despite promising advancements in deep learning in medical domains, challenges still
remain owing to data scarcity, compounded by privacy concerns and data ownership …

Learning Video Representations without Natural Videos

X Yu, X Chen, Y Gandelsman - arXiv preprint arXiv:2410.24213, 2024 - arxiv.org
We show that useful video representations can be learned from synthetic videos and natural
images, without incorporating natural videos in the training. We propose a progression of …

Aligning Neuronal Coding of Dynamic Visual Scenes with Foundation Vision Models

R Wu, F Zhou, Z Yin, KJ Liu - European Conference on Computer Vision, 2024 - Springer
Our brains represent the ever-changing environment with neurons in a highly dynamic
fashion. The temporal features of visual pixels in dynamic natural scenes are entrapped in …

Categorical Keypoint Positional Embedding for Robust Animal Re-Identification

Y Lin, L Liu, J Shi - arXiv preprint arXiv:2412.00818, 2024 - arxiv.org
Animal re-identification (ReID) has become an indispensable tool in ecological research,
playing a critical role in tracking population dynamics, analyzing behavioral patterns, and …

Interactive Teaching For Fine-Granular Few-Shot Object Recognition Using Vision Transformers

P Keller, D Jost, A Roennau… - 2024 IEEE International …, 2024 - ieeexplore.ieee.org
In real-world few-shot image classification tasks the lack of abundant data makes training
and testing very challenging. The classification model must learn the most meaningful …