Wenzhao Zheng
I am a fifth-year Ph.D. student in the Department of Automation at Tsinghua University, advised by Prof. Jie Zhou and Prof. Jiwen Lu.
In 2018, I received my BS degree from the Department of Physics, Tsinghua University.
I am interested in computer vision and deep learning. My current research focuses on:
Omni-supervised representation learning that exploits various types of supervision signals to learn discriminative and generalizable visual representations.
Vision-centric autonomous driving that efficiently perceives and predicts the complex 3D world based on images.
Explainable artificial intelligence that builds comprehensible and trustworthy AI systems with high performance.
Email / CV / Google Scholar / GitHub
|
|
News
2023-09: One paper on deep metric learning is accepted to T-PAMI.
2023-09: One paper on unsupervised indoor depth completion is accepted to T-CSVT.
2023-07: Three papers on representation learning and 3D occupancy prediction are accepted to ICCV 2023.
2023-01: Two papers on 3D occupancy prediction and deep metric learning are accepted to CVPR 2023.
2023-01: One paper on explainable deep networks is accepted to ICLR 2023.
2023-01: One paper on deep metric learning is accepted to T-PAMI.
2022-09: One paper on 3D object detection is accepted to AAAI 2023.
2022-09: One paper on surrounding depth estimation is accepted to CoRL 2022.
2022-07: One paper on dynamic metric learning is accepted to ECCV 2022.
2022-03: Two papers on explainable metric learning and 3D object detection are accepted to CVPR 2022.
|
Publications
* indicates equal contribution
|
|
Introspective Deep Metric Learning
Chengkun Wang* ,
Wenzhao Zheng*,
Zheng Zhu,
Jie Zhou ,
Jiwen Lu
IEEE Transactions on Pattern Analysis and Machine Intelligence (T-PAMI, IF: 24.31), 2023.
[arXiv]
[Code]
We propose an introspective deep metric learning (IDML) framework for uncertainty-aware comparisons of images.
|
|
SPTR: Structure-Preserving Transformer for Unsupervised Indoor Depth Completion
Linqing Zhao,
Wenzhao Zheng,
Yueqi Duan,
Jie Zhou ,
Jiwen Lu
IEEE Transactions on Circuits and Systems for Video Technology (T-CSVT, IF: 8.4), 2023.
[PDF (coming soon)]
[Code (coming soon)]
We propose a Structure-Preserving Encoding (SPE) module to reformulate depth completion as a process of 3D structure generation.
|
|
OPERA: Omni-Supervised Representation Learning with Hierarchical Supervisions
Chengkun Wang* ,
Wenzhao Zheng*,
Zheng Zhu,
Jie Zhou ,
Jiwen Lu
IEEE International Conference on Computer Vision (ICCV), 2023.
[arXiv]
[Code]
We unify fully supervised and self-supervised contrastive learning and exploit both supervisions from labeled and unlabeled data for training.
|
|
Token-Label Alignment for Vision Transformers
Han Xiao*,
Wenzhao Zheng*,
Zheng Zhu,
Jie Zhou ,
Jiwen Lu
IEEE International Conference on Computer Vision (ICCV), 2023.
[arXiv]
[Code]
We identify a token fluctuation phenomenon that has suppressed the potential of data mixing strategies for vision transformers. To address this, we propose a token-label alignment (TL-Align) method that traces the correspondence between transformed tokens and the original tokens to maintain a label for each token.
|
|
SurroundOcc: Multi-Camera 3D Occupancy Prediction for Autonomous Driving
Yi Wei*,
Linqing Zhao*,
Wenzhao Zheng,
Zheng Zhu,
Jie Zhou ,
Jiwen Lu
IEEE International Conference on Computer Vision (ICCV), 2023.
[arXiv]
[Code]
We design a pipeline to generate dense occupancy ground truths without expensive occupancy annotations, which enables the training of denser 3D occupancy prediction models.
|
|
Tri-Perspective View for Vision-Based 3D Semantic Occupancy Prediction
Yuanhui Huang* ,
Wenzhao Zheng*,
Yunpeng Zhang ,
Jie Zhou ,
Jiwen Lu
IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2023.
[arXiv]
[Code]
[Project Page]
Given only surround-camera RGB images as inputs, our model (trained using only sparse LiDAR point supervision) can predict the semantic occupancy for all volumes in the 3D space.
|
|
Deep Factorized Metric Learning
Chengkun Wang* ,
Wenzhao Zheng*,
Junlong Li,
Jie Zhou ,
Jiwen Lu
IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2023.
[PDF] (coming soon)
[Code] (coming soon)
We factorize the backbone network into different sub-blocks and learn an adaptive route for each sample to achieve the diversity of features.
|
|
Bort: Towards Explainable Neural Networks with Bounded Orthogonal Constraint
Borui Zhang,
Wenzhao Zheng,
Jie Zhou ,
Jiwen Lu
International Conference on Learning Representations (ICLR), 2023.
[arXiv]
[Code]
This paper proposes Bort, an optimizer for improving model explainability with boundedness and orthogonality constraints on model parameters, derived from the sufficient conditions of model comprehensibility and transparency.
|
|
Deep Metric Learning with Adaptively Composite Dynamic Constraints
Wenzhao Zheng,
Jiwen Lu ,
Jie Zhou
IEEE Transactions on Pattern Analysis and Machine Intelligence (T-PAMI, IF: 24.31), 2023.
[PDF]
[Code] (coming soon)
This paper formulates deep metric learning under a unified framework and proposes a dynamic constraint generator to produce adaptive composite constraints to train the metric towards good generalization.
|
|
Probabilistic Deep Metric Learning for Hyperspectral Image Classification
Chengkun Wang ,
Wenzhao Zheng,
Xian Sun ,
Jiwen Lu ,
Jie Zhou
arXiv, 2022.
[arXiv]
[Code]
We propose a probabilistic deep metric learning framework to model the categorical uncertainty of the spectral distribution of an observed pixel for hyperspectral image classification.
|
|
Dynamic Metric Learning with Cross-Level Concept Distillation
Wenzhao Zheng,
Yuanhui Huang ,
Borui Zhang,
Jie Zhou ,
Jiwen Lu
European Conference on Computer Vision (ECCV), 2022.
[PDF]
[Code]
This paper proposes a hierarchical concept refiner that constructs multiple levels of concept embeddings of an image and then pulls the corresponding concepts closer together to facilitate the cross-level semantic structure of the image representations.
|
|
A Simple Baseline for Multi-Camera 3D Object Detection
Yunpeng Zhang ,
Wenzhao Zheng,
Zheng Zhu,
Guan Huang,
Jie Zhou ,
Jiwen Lu
Thirty-Seventh AAAI Conference on Artificial Intelligence (AAAI), 2023.
[arXiv]
[Code]
We propose a simple baseline for multi-camera object detection to adapt existing monocular 3D object detection methods with a two-stage propose-and-fuse framework.
|
|
BEVerse: Unified Perception and Prediction in Birds-Eye-View for Vision-Centric Autonomous Driving
Yunpeng Zhang ,
Zheng Zhu,
Wenzhao Zheng,
Junjie Huang,
Guan Huang,
Jie Zhou ,
Jiwen Lu
arXiv, 2022.
[arXiv]
[Code]
We propose a unified framework for 3D perception and prediction based on multi-camera systems. The multi-task BEVerse outperforms existing single-task methods on 3D object detection, semantic map construction, and motion prediction.
|
|
SurroundDepth: Entangling Surrounding Views for Self-Supervised Multi-Camera Depth Estimation
Yi Wei*,
Linqing Zhao*,
Wenzhao Zheng,
Zheng Zhu,
Yongming Rao,
Guan Huang,
Jiwen Lu ,
Jie Zhou
Conference on Robot Learning (CoRL), 2022.
[arXiv]
[Code]
We propose a SurroundDepth method to incorporate the information from multiple surrounding views to predict depth maps across cameras.
|
|
Dimension Embeddings for Monocular 3D Object Detection
Yunpeng Zhang ,
Wenzhao Zheng,
Zheng Zhu,
Guan Huang,
Dalong Du,
Jie Zhou ,
Jiwen Lu
IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2022.
[PDF]
We propose a general method to learn appropriate embeddings for dimension estimation in monocular 3D object detection.
|
|
Attributable Visual Similarity Learning
Borui Zhang,
Wenzhao Zheng,
Jie Zhou ,
Jiwen Lu
IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2022.
[arXiv]
[Code]
This paper proposes an attributable visual similarity learning (AVSL) framework, which employs a generalized similarity learning paradigm to represent the similarity between two images with a graph for a more accurate and explainable similarity measure between images.
|
|
Deep Relational Metric Learning
Wenzhao Zheng*,
Borui Zhang*,
Jiwen Lu ,
Jie Zhou
IEEE International Conference on Computer Vision (ICCV), 2021.
[arXiv]
[Code]
We construct a graph to represent each image and perform relational inference to infer the visual similarity.
|
|
Deep Compositional Metric Learning
Wenzhao Zheng,
Chengkun Wang ,
Jiwen Lu ,
Jie Zhou
IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2021.
[PDF]
[Code]
We adaptively learn a set of composites of embeddings to receive supervision signals from different tasks to improve the generalization of the learned embeddings without sacrificing the discriminativeness.
|
|
Structural Deep Metric Learning for Room Layout Estimation
Wenzhao Zheng,
Jiwen Lu ,
Jie Zhou
European Conference on Computer Vision (ECCV), 2020.
[PDF]
We are the first to apply deep metric learning to prediction tasks with structured labels.
|
|
Hardness-Aware Deep Metric Learning
Wenzhao Zheng,
Jiwen Lu ,
Jie Zhou
IEEE Transactions on Pattern Analysis and Machine Intelligence (T-PAMI, IF: 24.31), 2021.
[PDF]
[Code]
We extend the previous conference-version HDML to generate multiple synthetics for each sample.
|
|
Deep Metric Learning via Adaptive Learnable Assessment
Wenzhao Zheng,
Jiwen Lu ,
Jie Zhou
IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2020.
[PDF]
We learn a sample assessment strategy for deep metric learning to maximize the generalization of the trained metric.
|
|
Deep Adversarial Metric Learning
Yueqi Duan ,
Jiwen Lu ,
Wenzhao Zheng,
Jie Zhou
IEEE Transactions on Image Processing (T-IP, IF: 11.041), 2020.
[PDF]
[Code]
We propose a deep adversarial multi-metric learning (DAMML) method by learning multiple local transformations for a more complete description.
|
|
Hardness-Aware Deep Metric Learning
Wenzhao Zheng,
Zhaodong Chen ,
Jiwen Lu ,
Jie Zhou
IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2019 (oral).
[PDF]
[Code]
We perform linear interpolation on embeddings to adaptively manipulate their hardness levels and generate corresponding label-preserving synthetics for recycled training.
|
|
Deep Adversarial Metric Learning
Yueqi Duan ,
Wenzhao Zheng,
Xudong Lin ,
Jiwen Lu ,
Jie Zhou
IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2018 (spotlight).
[PDF]
[Code]
We generate potential hard negatives adversarial to the learned metric as complements.
|
Honors and Awards
Tsinghua Excellent Doctoral Dissertation Award
2023 Beijing Outstanding Graduate
2023 Tsinghua Outstanding Graduate
2022 Xuancheng Scholarship
2021 National Scholarship
CVPR 2021 Outstanding Reviewer
2020 Changtong Scholarship
2019 National Scholarship
2017 Tung OOCL Scholarship
2016 German Scholarship
|
Academic Services
Conference Reviewer / PC Member: CVPR 2019-2022, ICCV 2019-2021, ECCV 2020-2022, NeurIPS 2023, IJCAI 2020-2022, WACV 2020-2022, ICME 2019-2022
Senior PC Member: IJCAI 2021
Journal Reviewer: T-PAMI, T-NNLS, T-IP, T-BIOM, T-IST, Pattern Recognition, Pattern Recognition Letters
|
|