SAM2-Adapter: Evaluating & Adapting Segment Anything 2 in Downstream Tasks: Camouflage, Shadow, Medical Image Segmentation, and More

T Chen, A Lu, L Zhu, C Ding, C Yu, D Ji, Z Li… - arXiv preprint arXiv …, 2024 - arxiv.org
The advent of large models, also known as foundation models, has significantly transformed
the AI research landscape, with models like Segment Anything (SAM) achieving notable …

PPTFormer: Pseudo Multi-Perspective Transformer for UAV Segmentation

D Ji, W Jin, H Lu, F Zhao - arXiv preprint arXiv:2406.19632, 2024 - arxiv.org
The ascension of Unmanned Aerial Vehicles (UAVs) in various fields necessitates effective
UAV image segmentation, which faces challenges due to the dynamic perspectives of UAV …

Search3D: Hierarchical Open-Vocabulary 3D Segmentation

A Takmaz, A Delitzas, RW Sumner… - IEEE Robotics and …, 2025 - ieeexplore.ieee.org
Open-vocabulary 3D segmentation enables exploration of 3D spaces using free-form text
descriptions. Existing methods for open-vocabulary 3D instance segmentation primarily …

Structural and Statistical Texture Knowledge Distillation and Learning for Segmentation

D Ji, F Zhao, H Lu, F Wu, J Ye - IEEE Transactions on Pattern …, 2025 - ieeexplore.ieee.org
We propose to re-emphasize the low-level texture information in deep networks for semantic
segmentation and related knowledge distillation tasks. Low-level texture feature/knowledge …

Not Every Patch is Needed: Towards a More Efficient and Effective Backbone for Video-based Person Re-identification

L Zhu, T Chen, D Ji, J Ye, J Liu - IEEE Transactions on Image …, 2025 - ieeexplore.ieee.org
This paper proposes a new effective and efficient plug-and-play backbone for video-based
person re-identification (ReID). Conventional video-based ReID methods typically use CNN …

Multimodal 3D Reasoning Segmentation with Complex Scenes

X Jiang, L Lu, L Shao, S Lu - arXiv preprint arXiv:2411.13927, 2024 - arxiv.org
The recent development in multimodal learning has greatly advanced the research in 3D
scene understanding in various real-world tasks such as embodied AI. However, most …

Let Human Sketches Help: Empowering Challenging Image Segmentation Task with Freehand Sketches

Y Zang, R Cao, J Zhang, Y Han, Z Cao, W Hu… - arXiv preprint arXiv …, 2025 - arxiv.org
Sketches, with their expressive potential, allow humans to convey the essence of an object
through even a rough contour. For the first time, we harness this expressive potential to …

Discrete Latent Perspective Learning for Segmentation and Detection

D Ji, F Zhao, L Zhu, W Jin, H Lu, J Ye - arXiv preprint arXiv:2406.10475, 2024 - arxiv.org
In this paper, we address the challenge of Perspective-Invariant Learning in machine
learning and computer vision, which involves enabling a network to understand images from …