Hao Zhang
NVIDIA Research
Verified email at connect.ust.hk - Homepage
Title
Cited by
Year
Grounding dino: Marrying dino with grounded pre-training for open-set object detection
S Liu, Z Zeng, T Ren, F Li, H Zhang, J Yang, C Li, J Yang, H Su, J Zhu, ...
ECCV 2024, 2023
1596 · 2023
DINO: Detr with improved denoising anchor boxes for end-to-end object detection
H Zhang*, F Li*, S Liu*, L Zhang, H Su, J Zhu, LM Ni, HY Shum
International Conference on Learning Representations (ICLR), 2023, 2022
1581 · 2022
DAB-DETR: Dynamic anchor boxes are better queries for DETR
S Liu, F Li, H Zhang, X Yang, X Qi, H Su, J Zhu, L Zhang
International Conference on Learning Representations (ICLR), 2022, 2022
897 · 2022
DN-DETR: Accelerate DETR Training by Introducing Query Denoising
F Li*, H Zhang*, S Liu, J Guo, LM Ni, L Zhang
The IEEE / CVF Computer Vision and Pattern Recognition Conference (CVPR …, 2022
802 · 2022
Segment everything everywhere all at once
X Zou*, J Yang*, H Zhang*, F Li*, L Li, J Gao, YJ Lee
NeurIPS 2023, 2023
530 · 2023
Mask DINO: Towards A Unified Transformer-based Framework for Object Detection and Segmentation
F Li*, H Zhang*, S Liu, L Zhang, LM Ni, HY Shum
The IEEE / CVF Computer Vision and Pattern Recognition Conference (CVPR) 2023, 2022
422 · 2022
Llava-onevision: Easy visual task transfer
B Li, Y Zhang, D Guo, R Zhang, F Li, H Zhang, K Zhang, P Zhang, Y Li, ...
arXiv preprint arXiv:2408.03326, 2024
250 · 2024
Grounded sam: Assembling open-world models for diverse visual tasks
T Ren, S Liu, A Zeng, J Lin, K Li, H Cao, J Chen, X Huang, Y Chen, F Yan, ...
arXiv preprint arXiv:2401.14159, 2024
240 · 2024
Set-of-Mark Prompting Unleashes Extraordinary Visual Grounding in GPT-4V
J Yang*, H Zhang*, F Li*, X Zou*, C Li, J Gao
arXiv preprint arXiv:2310.11441, 2023
220 · 2023
Semantic-SAM: Segment and Recognize Anything at Any Granularity
F Li*, H Zhang*, P Sun, X Zou, S Liu, J Yang, C Li, L Zhang, J Gao
ECCV 2024, 2023
167* · 2023
A simple framework for open-vocabulary segmentation and detection
H Zhang*, F Li*, X Zou, S Liu, C Li, J Gao, J Yang, L Zhang
ICCV 2023, 2023
160 · 2023
LLaVA-NeXT-Interleave: Tackling Multi-image, Video, and 3D in Large Multimodal Models
F Li*, R Zhang*, H Zhang*, Y Zhang, B Li, W Li, Z Ma, C Li
arXiv preprint arXiv:2407.07895, 2024
100 · 2024
Llava-plus: Learning to use tools for creating multimodal agents
S Liu, H Cheng, H Liu, H Zhang, F Li, T Ren, X Zou, J Yang, H Su, J Zhu, ...
ECCV 2024, 2023
96 · 2023
Lite DETR: An Interleaved Multi-Scale Encoder for Efficient DETR
F Li, A Zeng, S Liu, H Zhang, H Li, L Zhang, LM Ni
CVPR 2023, 2023
86 · 2023
MP-Former: Mask-Piloted Transformer for Image Segmentation
H Zhang, F Li, H Xu, S Huang, S Liu, LM Ni, L Zhang
The IEEE / CVF Computer Vision and Pattern Recognition Conference (CVPR) 2023, 2023
70 · 2023
Llava-next: Stronger llms supercharge multimodal capabilities in the wild
B Li, K Zhang, H Zhang, D Guo, R Zhang, F Li, Y Zhang, Z Liu, C Li
May, 2024
51 · 2024
LLaVA-Grounding: Grounded Visual Chat with Large Multimodal Models
H Zhang*, H Li*, F Li, T Ren, X Zou, S Liu, S Huang, J Gao, L Zhang, C Li, ...
ECCV 2024, 2023
51 · 2023
Vision-Language Intelligence: Tasks, Representation Learning, and Large Models
F Li*, H Zhang*, YF Zhang, S Liu, J Guo, LM Ni, PC Zhang, L Zhang
arXiv preprint arXiv:2203.01922, 2022
45 · 2022
Detection Transformer with Stable Matching
S Liu, T Ren, J Chen, Z Zeng, H Zhang, F Li, H Li, J Huang, H Su, J Zhu, ...
ICCV 2023, 2023
39 · 2023
Visual In-Context Prompting
F Li, Q Jiang, H Zhang, T Ren, S Liu, X Zou, H Xu, H Li, C Li, J Yang, ...
CVPR 2024, 2023
26 · 2023
Articles 1–20