About Me
👋 I am a Ph.D. candidate in the AI Thrust at The Hong Kong University of Science and Technology, Guangzhou campus. I am fortunate to be advised by Xuming Hu @ HKUST and Raymond Chi-Wing Wong @ HKUST. I also serve as a Resident Doctoral Researcher at INSAIT, under the supervision of Prof. Luc Van Gool and Dr. Danda Paudel. Recently, I have also been collaborating with Linfeng Zhang @ SJTU and Kailun Yang @ HNU. My research centers on machine perception, reasoning, and interaction with the physical world.
Currently, I focus on:
- Multi-modal Learning: foundation models, representation learning, self-supervision, contrastive alignment.
- Multi-modal Large Models: spatial & embodied reasoning, debiasing, retrieval-augmented understanding & generation.
- Multi-modal in Computer Vision: novel sensors, sensor fusion, scene understanding.
🔥 I am actively seeking job opportunities (academia & industry) for Fall 2026!
News
- 2025.10: One paper accepted to IJCV
- 2025.10: One paper accepted to IEEE TCSVT: CLIP-to-Seg
- 2025.09: Two papers accepted to NeurIPS 2025: Domain-RAG & HoloV
- 2025.06: One paper accepted to BMVC 2025: Split Matching
- 2025.06: Four papers (one Highlight (2.8%)) accepted to ICCV 2025: OmniSAM (Highlight) & CIARD & UNLOCK & Unimodal Bias
- 2025.06: Our paper was selected as Best Paper at CVPR 2025 @ TMM Open-World! Paper
- 2025.06: One paper accepted to IROS 2025: SHIFTNet
- 2025.05: Two papers accepted to ACL 2025 Findings: MMUNLearner & Mathematical Reasoning Survey
- 2025.05: One paper accepted to ICML 2025: RealRAG
- 2025.04: The first survey on RAG in CV released: Paper
- 2025.04: Our paper accepted to CVPR 2025 @ TMM Open-World as Oral Presentation: MMSS-Bench
- 2025.02: Visiting INSAIT as a Resident Doctoral Researcher! LinkedIn
- 2025.01: Successfully passed the PhD Qualifying Examination!
- 2024.12: Invited as an Area Chair of PDLM @ AAAI 2025.
- 2024.10: One paper accepted to IEEE TPAMI: 360SFUDA++
- 2024.10: Oral presentation @ ECCV 2024 Oral Session 5A: Segmentation Video.
- 2024.09: One paper accepted to Pattern Recognition.
- 2024.07: Three papers (one Oral (1.5%)) accepted to ECCV 2024.
- 2024.03: One paper accepted to IEEE CAI 2024.
- 2024.03: One paper accepted to Pattern Recognition.
- 2024.03: Five papers (one Highlight (2.8%)) accepted to CVPR 2024.
- 2024.02: Two papers accepted to ICRA 2024.
- 2023.07: Two papers accepted to ICCV 2023.
- 2023.03: One paper accepted to CVPR 2023.
Invited Talks
- “Omnidirectional Vision: From Scene Understanding, Spatial Intelligence to Industrial Applications”
SPIC Energy Science and Technology Research Institute, Shanghai, China, August 2025.
- “PANORAMA: Exploring the Industrial Potentials of Omnidirectional Vision”
Yangtze River Delta International Talent Port, Wuxi, China, August 2025.
- “Retrieval-augmented Realistic Image Generation via Self-reflective Contrastive Learning”
VIVO, Shenzhen, China, August 2025. Invited by Dr. Kanzhi Wu.
Mentorship
Current: Yuanhuiyi Lyu (PhD, HKUST-GZ); Lutao Jiang (PhD, HKUST-GZ); Jialei Chen (PhD, Nagoya); Mengzhen Chi (PhD, NEU); Zihao Dongfang (RA, HKUST-GZ); Chenfei Liao (MPhil, HKUST-GZ); Junha Moon (MPhil, HKUST-GZ); Ziqiao Weng (UG, SCU); Yulong Guo (MS, ZJU); Kaiyu Lei (UG, XJTU); Zhenquan Zhang (MPhil, SCUT); Boyuan Zheng (MPhil, Tongji); Leyi Sheng (UG, HKUST-GZ)
Past: Ding Zhong (MS, UMich); Zhengxuan Jiang (MPhil, ZJU); Yunhao Luo (PhD, UMich); Tianbo Pan (PhD, NUS); Zijie Lin (MS, USTC)
✉️ Feel free to contact me for discussion and collaboration!
Services
- Area Chair: PDLM Workshop @ AAAI 2025
- Reviewer: IJCV, TIP, TNNLS, TMM, TCI, Neurocomputing, etc.
- PC Member: ICLR (2024, 2025, 2026), CVPR (2025, 2026), ICML (2025), ICCV (2025), ECCV (2024), NeurIPS (2024, 2025), AAAI (2026), ACM MM (2025), ICRA (2025), ICME (2025), WACV (2026)