Wenzhao Zheng

I am a fifth-year Ph.D. student in the Department of Automation at Tsinghua University, advised by Prof. Jie Zhou and Prof. Jiwen Lu. In 2018, I received my B.S. degree from the Department of Physics, Tsinghua University.

I am interested in computer vision and deep learning. My current research focuses on:

  • Omni-supervised representation learning that exploits various types of supervision signals to learn discriminative and generalizable visual representations.
  • Vision-centric autonomous driving that efficiently perceives and predicts the complex 3D world based on images.
  • Explainable artificial intelligence that builds comprehensible and trustworthy AI systems with high performance.

Email  /  CV  /  Google Scholar  /  GitHub

    News

  • 2023-09: One paper on deep metric learning is accepted to T-PAMI.
  • 2023-09: One paper on unsupervised indoor depth completion is accepted to T-CSVT.
  • 2023-07: Three papers on representation learning and 3D occupancy prediction are accepted to ICCV 2023.
  • 2023-01: Two papers on 3D occupancy prediction and deep metric learning are accepted to CVPR 2023.
  • 2023-01: One paper on explainable deep networks is accepted to ICLR 2023.
  • 2023-01: One paper on deep metric learning is accepted to T-PAMI.
  • 2022-09: One paper on 3D object detection is accepted to AAAI 2023.
  • 2022-09: One paper on surrounding depth estimation is accepted to CoRL 2022.
  • 2022-07: One paper on dynamic metric learning is accepted to ECCV 2022.
  • 2022-03: Two papers on explainable metric learning and 3D object detection are accepted to CVPR 2022.

    Publications

    * indicates equal contribution

    Introspective Deep Metric Learning
    Chengkun Wang*, Wenzhao Zheng*, Zheng Zhu, Jie Zhou, Jiwen Lu
    IEEE Transactions on Pattern Analysis and Machine Intelligence (T-PAMI, IF: 24.31), 2023.
    [arXiv] [Code]

    We propose an introspective deep metric learning (IDML) framework for uncertainty-aware comparisons of images.

    SPTR: Structure-Preserving Transformer for Unsupervised Indoor Depth Completion
    Linqing Zhao, Wenzhao Zheng, Yueqi Duan, Jie Zhou, Jiwen Lu
    IEEE Transactions on Circuits and Systems for Video Technology (T-CSVT, IF: 8.4), 2023.
    [PDF (coming soon)] [Code (coming soon)]

    We propose a Structure-Preserving Encoding (SPE) module to reformulate depth completion as a process of 3D structure generation.

    OPERA: Omni-Supervised Representation Learning with Hierarchical Supervisions
    Chengkun Wang*, Wenzhao Zheng*, Zheng Zhu, Jie Zhou, Jiwen Lu
    IEEE International Conference on Computer Vision (ICCV), 2023.
    [arXiv] [Code]

    We unify fully supervised and self-supervised contrastive learning and exploit both supervisions from labeled and unlabeled data for training.

    Token-Label Alignment for Vision Transformers
    Han Xiao*, Wenzhao Zheng*, Zheng Zhu, Jie Zhou, Jiwen Lu
    IEEE International Conference on Computer Vision (ICCV), 2023.
    [arXiv] [Code]

    We identify a token fluctuation phenomenon that has suppressed the potential of data mixing strategies for vision transformers. To address this, we propose a token-label alignment (TL-Align) method that traces the correspondence between transformed tokens and the original tokens to maintain a label for each token.

    SurroundOcc: Multi-Camera 3D Occupancy Prediction for Autonomous Driving
    Yi Wei*, Linqing Zhao*, Wenzhao Zheng, Zheng Zhu, Jie Zhou, Jiwen Lu
    IEEE International Conference on Computer Vision (ICCV), 2023.
    [arXiv] [Code]

    We design a pipeline to generate dense occupancy ground truths without expensive occupancy annotations, which enables the training of denser 3D occupancy prediction models.

    Tri-Perspective View for Vision-Based 3D Semantic Occupancy Prediction
    Yuanhui Huang*, Wenzhao Zheng*, Yunpeng Zhang, Jie Zhou, Jiwen Lu
    IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2023.
    [arXiv] [Code] [Project Page]

    Given only surround-camera RGB images as inputs, our model (trained using only sparse LiDAR point supervision) can predict the semantic occupancy for all volumes in the 3D space.

    Deep Factorized Metric Learning
    Chengkun Wang*, Wenzhao Zheng*, Junlong Li, Jie Zhou, Jiwen Lu
    IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2023.
    [PDF] (coming soon) [Code] (coming soon)

    We factorize the backbone network into different sub-blocks and learn an adaptive route for each sample to achieve feature diversity.

    Bort: Towards Explainable Neural Networks with Bounded Orthogonal Constraint
    Borui Zhang, Wenzhao Zheng, Jie Zhou, Jiwen Lu
    International Conference on Learning Representations (ICLR), 2023.
    [arXiv] [Code]

    This paper proposes Bort, an optimizer for improving model explainability with boundedness and orthogonality constraints on model parameters, derived from the sufficient conditions of model comprehensibility and transparency.

    Deep Metric Learning with Adaptively Composite Dynamic Constraints
    Wenzhao Zheng, Jiwen Lu, Jie Zhou
    IEEE Transactions on Pattern Analysis and Machine Intelligence (T-PAMI, IF: 24.31), 2023.
    [PDF] [Code] (coming soon)

    This paper formulates deep metric learning under a unified framework and proposes a dynamic constraint generator to produce adaptive composite constraints to train the metric towards good generalization.

    Probabilistic Deep Metric Learning for Hyperspectral Image Classification
    Chengkun Wang, Wenzhao Zheng, Xian Sun, Jiwen Lu, Jie Zhou
    arXiv, 2022.
    [arXiv] [Code]

    We propose a probabilistic deep metric learning framework to model the categorical uncertainty of the spectral distribution of an observed pixel for hyperspectral image classification.

    Dynamic Metric Learning with Cross-Level Concept Distillation
    Wenzhao Zheng, Yuanhui Huang, Borui Zhang, Jie Zhou, Jiwen Lu
    European Conference on Computer Vision (ECCV), 2022.
    [PDF] [Code]

    This paper proposes a hierarchical concept refiner that constructs multiple levels of concept embeddings of an image and then reduces the distance between corresponding concepts to facilitate the cross-level semantic structure of the image representations.

    A Simple Baseline for Multi-Camera 3D Object Detection
    Yunpeng Zhang, Wenzhao Zheng, Zheng Zhu, Guan Huang, Jie Zhou, Jiwen Lu
    Thirty-Seventh AAAI Conference on Artificial Intelligence (AAAI), 2023.
    [arXiv] [Code]

    We propose a simple baseline for multi-camera object detection to adapt existing monocular 3D object detection methods with a two-stage propose-and-fuse framework.

    BEVerse: Unified Perception and Prediction in Birds-Eye-View for Vision-Centric Autonomous Driving
    Yunpeng Zhang, Zheng Zhu, Wenzhao Zheng, Junjie Huang, Guan Huang, Jie Zhou, Jiwen Lu
    arXiv, 2022.
    [arXiv] [Code]

    We propose a unified framework for 3D perception and prediction based on multi-camera systems. The multi-task BEVerse outperforms existing single-task methods on 3D object detection, semantic map construction, and motion prediction.

    SurroundDepth: Entangling Surrounding Views for Self-Supervised Multi-Camera Depth Estimation
    Yi Wei*, Linqing Zhao*, Wenzhao Zheng, Zheng Zhu, Yongming Rao, Guan Huang, Jiwen Lu, Jie Zhou
    Conference on Robot Learning (CoRL), 2022.
    [arXiv] [Code]

    We propose SurroundDepth, a method that incorporates information from multiple surrounding views to predict depth maps across cameras.

    Dimension Embeddings for Monocular 3D Object Detection
    Yunpeng Zhang, Wenzhao Zheng, Zheng Zhu, Guan Huang, Dalong Du, Jie Zhou, Jiwen Lu
    IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2022.
    [PDF]

    We propose a general method to learn appropriate embeddings for dimension estimation in monocular 3D object detection.

    Attributable Visual Similarity Learning
    Borui Zhang, Wenzhao Zheng, Jie Zhou, Jiwen Lu
    IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2022.
    [arXiv] [Code]

    This paper proposes an attributable visual similarity learning (AVSL) framework, which employs a generalized similarity learning paradigm to represent the similarity between two images with a graph for a more accurate and explainable similarity measure between images.

    Deep Relational Metric Learning
    Wenzhao Zheng*, Borui Zhang*, Jiwen Lu, Jie Zhou
    IEEE International Conference on Computer Vision (ICCV), 2021.
    [arXiv] [Code]

    We construct a graph to represent each image and perform relational inference to infer the visual similarity.

    Deep Compositional Metric Learning
    Wenzhao Zheng, Chengkun Wang, Jiwen Lu, Jie Zhou
    IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2021.
    [PDF] [Code]

    We adaptively learn a set of composites of embeddings to receive supervision signals from different tasks to improve the generalization of the learned embeddings without sacrificing the discriminativeness.

    Structural Deep Metric Learning for Room Layout Estimation
    Wenzhao Zheng, Jiwen Lu, Jie Zhou
    European Conference on Computer Vision (ECCV), 2020.
    [PDF]

    We are the first to apply deep metric learning to prediction tasks with structured labels.

    Hardness-Aware Deep Metric Learning
    Wenzhao Zheng, Jiwen Lu, Jie Zhou
    IEEE Transactions on Pattern Analysis and Machine Intelligence (T-PAMI, IF: 24.31), 2021.
    [PDF] [Code]

    We extend the previous conference version of HDML to generate multiple synthetics for each sample.

    Deep Metric Learning via Adaptive Learnable Assessment
    Wenzhao Zheng, Jiwen Lu, Jie Zhou
    IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2020.
    [PDF]

    We learn a sample assessment strategy for deep metric learning to maximize the generalization of the trained metric.

    Deep Adversarial Metric Learning
    Yueqi Duan, Jiwen Lu, Wenzhao Zheng, Jie Zhou
    IEEE Transactions on Image Processing (T-IP, IF: 11.041), 2020.
    [PDF] [Code]

    We propose a deep adversarial multi-metric learning (DAMML) method that learns multiple local transformations for a more complete description.

    Hardness-Aware Deep Metric Learning
    Wenzhao Zheng, Zhaodong Chen, Jiwen Lu, Jie Zhou
    IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2019 (oral).
    [PDF] [Code]

    We perform linear interpolation on embeddings to adaptively manipulate their hardness levels and generate corresponding label-preserving synthetics for recycled training.

    Deep Adversarial Metric Learning
    Yueqi Duan, Wenzhao Zheng, Xudong Lin, Jiwen Lu, Jie Zhou
    IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2018 (spotlight).
    [PDF] [Code]

    We generate potential hard negatives adversarial to the learned metric as complements.

    Honors and Awards

  • Tsinghua Excellent Doctoral Dissertation Award
  • 2023 Beijing Outstanding Graduate
  • 2023 Tsinghua Outstanding Graduate
  • 2022 Xuancheng Scholarship
  • 2021 National Scholarship
  • CVPR 2021 Outstanding Reviewer
  • 2020 Changtong Scholarship
  • 2019 National Scholarship
  • 2017 Tung OOCL Scholarship
  • 2016 German Scholarship

    Academic Services

  • Conference Reviewer / PC Member: CVPR 2019-2022, ICCV 2019-2021, ECCV 2020-2022, NeurIPS 2023, IJCAI 2020-2022, WACV 2020-2022, ICME 2019-2022
  • Senior PC Member: IJCAI 2021
  • Journal Reviewer: T-PAMI, T-NNLS, T-IP, T-BIOM, T-IST, Pattern Recognition, Pattern Recognition Letters



    © Wenzhao Zheng | Last updated: July 23, 2023