A survey of robot intelligence with large language models
Since the emergence of ChatGPT, research on large language models (LLMs) has actively
progressed across various fields. LLMs, pre-trained on vast text datasets, have exhibited …
Segment anything in medical images and videos: Benchmark and deployment
Recent advances in segmentation foundation models have enabled accurate and efficient
segmentation across a wide range of natural images and videos, but their utility to medical …
GR-2: A generative video-language-action model with web-scale knowledge for robot manipulation
We present GR-2, a state-of-the-art generalist robot agent for versatile and generalizable
robot manipulation. GR-2 is first pre-trained on a vast number of Internet videos to capture …
Large scale foundation models for intelligent manufacturing applications: a survey
H Zhang, SD Semujju, Z Wang, X Lv, K Xu… - Journal of Intelligent …, 2025 - Springer
Although the applications of artificial intelligence, especially deep learning, have greatly
improved various aspects of intelligent manufacturing, they still face challenges for broader …
Flow as the cross-domain manipulation interface
We present Im2Flow2Act, a scalable learning framework that enables robots to acquire real-
world manipulation skills without the need for real-world robot training data. The key idea …
SAM2-UNet: Segment Anything 2 makes strong encoder for natural and medical image segmentation
Image segmentation plays an important role in vision understanding. Recently, emerging
vision foundation models have continuously achieved superior performance on various tasks …
SAM2-Adapter: Evaluating & adapting Segment Anything 2 in downstream tasks: Camouflage, shadow, medical image segmentation, and more
The advent of large models, also known as foundation models, has significantly transformed
the AI research landscape, with models like Segment Anything (SAM) achieving notable …
UniMatch V2: Pushing the limit of semi-supervised semantic segmentation
Semi-supervised semantic segmentation (SSS) aims at learning rich visual knowledge from
cheap unlabeled images to enhance semantic segmentation capability. Among recent …
PointSAM: Pointly-Supervised Segment Anything Model for Remote Sensing Images
Segment Anything Model (SAM) is an advanced foundational model for image
segmentation, which is gradually being applied to remote sensing images (RSIs). Due to the …
EVF-SAM: Early vision-language fusion for text-prompted Segment Anything Model
Segment Anything Model (SAM) has attracted widespread attention for its superior
interactive segmentation capabilities with visual prompts while lacking further exploration of …
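Several of the entries above revolve around prompting the Segment Anything Model. As a minimal sketch of what "interactive segmentation with visual prompts" looks like in practice, the snippet below uses Meta's segment_anything package with a single point prompt; the checkpoint path, placeholder image, and point coordinates are illustrative assumptions, not the setup of EVF-SAM or of any specific paper listed here.

```python
import numpy as np
from segment_anything import sam_model_registry, SamPredictor

# Load a pretrained SAM backbone; "sam_vit_b.pth" is a placeholder path
# for a checkpoint downloaded from the official repository.
sam = sam_model_registry["vit_b"](checkpoint="sam_vit_b.pth")
predictor = SamPredictor(sam)

# Placeholder image: SAM expects an HxWx3 uint8 RGB array
# (in practice, load a real image with PIL or OpenCV).
image = np.zeros((480, 640, 3), dtype=np.uint8)
predictor.set_image(image)

# A single foreground click serves as the visual prompt.
masks, scores, _ = predictor.predict(
    point_coords=np.array([[320, 240]]),  # (x, y) in image pixels
    point_labels=np.array([1]),           # 1 = foreground, 0 = background
    multimask_output=True,                # return several candidate masks
)
best_mask = masks[np.argmax(scores)]      # keep the highest-scoring mask
```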