| Title | Authors | Venue | Cited by | Year |
| --- | --- | --- | --- | --- |
| Caption anything: Interactive image description with diverse multimodal controls | T Wang*, J Zhang*, J Fei*, H Zheng, Y Tang, Z Li, M Gao, S Zhao | arXiv preprint arXiv:2305.02677, 2023 | 88 | 2023 |
| Transferable decoding with visual entities for zero-shot image captioning | J Fei*, T Wang*, J Zhang, Z He, C Wang, F Zheng | Proceedings of the IEEE/CVF International Conference on Computer Vision …, 2023 | 38 | 2023 |
| Fast 3-D electromagnetic full-wave inversion of dielectric anisotropic objects based on ResU-Net enhanced by variational Born iterative method | J Fei, Y Chen, M Zhong, F Han | IEEE Transactions on Antennas and Propagation 70 (8), 6229-6239, 2022 | 6 | 2022 |
| Hybrid microwave imaging of 3-D objects using LSM and BIM aided by a CNN U-Net | F Han, M Zhong, J Fei | IEEE Transactions on Geoscience and Remote Sensing 60, 1-9, 2022 | 4 | 2022 |
| Kestrel: Point Grounding Multimodal LLM for Part-Aware 3D Vision-Language Understanding | J Fei, M Ahmed, J Ding, EM Bakr, M Elhoseiny | arXiv preprint arXiv:2405.18937, 2024 | 1 | 2024 |
| Document Haystacks: Vision-Language Reasoning Over Piles of 1000+ Documents | J Chen, D Xu, J Fei, CM Feng, M Elhoseiny | arXiv preprint arXiv:2411.16740, 2024 | | 2024 |