LERF: Language embedded radiance fields
Humans describe the physical world using natural language to refer to specific 3D locations
based on a vast range of properties: visual appearance, semantics, abstract associations, or …
LangSplat: 3D language Gaussian splatting
Humans live in a 3D world and commonly use natural language to interact with a 3D scene.
Modeling a 3D language field to support open-ended language queries in 3D has gained …
Task me anything
Benchmarks for large multimodal language models (MLMs) now serve to simultaneously
assess the general capabilities of models instead of evaluating for a specific capability. As a …
Going beyond nouns with vision & language models using synthetic data
Large-scale pre-trained Vision & Language (VL) models have shown remarkable
performance in many applications, enabling replacing a fixed set of supported classes with …
Language-driven grasp detection
Grasp detection is a persistent and intricate challenge with various industrial applications.
Recently many methods and datasets have been proposed to tackle the grasp detection …
FMGS: Foundation model embedded 3D Gaussian splatting for holistic 3D scene understanding
Precisely perceiving the geometric and semantic properties of real-world 3D objects is
crucial for the continued evolution of augmented reality and robotic applications. To this end …
Prompt-guided zero-shot anomaly action recognition using pretrained deep skeleton features
This study investigates unsupervised anomaly action recognition, which identifies video-
level abnormal-human-behavior events in an unsupervised manner without abnormal …
SwapMix: Diagnosing and regularizing the over-reliance on visual context in visual question answering
While Visual Question Answering (VQA) has progressed rapidly, previous works
raise concerns about robustness of current VQA models. In this work, we study the …
Dual learning with dynamic knowledge distillation for partially relevant video retrieval
J Dong, M Zhang, Z Zhang, X Chen… - Proceedings of the …, 2023 - openaccess.thecvf.com
Almost all previous text-to-video retrieval works assume that videos are pre-trimmed with
short durations. However, in practice, videos are generally untrimmed containing much …
EarthVQA: Towards queryable earth via relational reasoning-based remote sensing visual question answering
Earth vision research typically focuses on extracting geospatial object locations and
categories but neglects the exploration of relations between objects and comprehensive …