Short Bio
I am a research scientist in the Gen AI team at Meta, building multi-modality Llama models. Before that, I worked at Meta Reality Labs for 5 years, focusing on face tracking in AR/VR and the creation of stylized and photorealistic avatars.
Prior to Meta, I obtained my Ph.D. in Electronic Engineering at The Chinese University of Hong Kong, advised by Prof. Xiaogang Wang. I graduated from Tsinghua University with a B.Eng. degree in Computer Science.
I am passionate about developing multi-modality foundation models and applying them to help extend human capabilities, build autonomous machines, and ultimately, engineer humanoids with general intelligence.
Recent News
(2024/08) We published the Llama 3 paper. Excited to have contributed to the vision post-training.
Publications
Geometric Correspondence Fields: Learned Differentiable Rendering for 3D Pose Refinement in the Wild
European Conference on Computer Vision (ECCV), 2020
Video Person Re-Identification With Competitive Snippet-Similarity Aggregation and Co-Attentive Snippet Embedding
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2018
Identity-Aware Textual-Visual Matching with Latent Co-attention
IEEE International Conference on Computer Vision (ICCV), 2017
Learning Deep Neural Networks for Vehicle Re-ID with Visual-spatio-temporal Path Proposals
IEEE International Conference on Computer Vision (ICCV), 2017
Object Detection in Videos with Tubelet Proposal Networks
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017