Dr. Jehanzeb Mirza
MIT CSAIL
Publications
The Norm Must Go On: Dynamic Unsupervised Domain Adaptation by Normalization
2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) / Jun 01, 2022
Mirza, M. J., Micorek, J., Possegger, H., & Bischof, H. (2022). The Norm Must Go On: Dynamic Unsupervised Domain Adaptation by Normalization. 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 14745–14755. https://doi.org/10.1109/cvpr52688.2022.01435
An Efficient Domain-Incremental Learning Approach to Drive in All Weather Conditions
2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW) / Jun 01, 2022
Mirza, M. J., Masana, M., Possegger, H., & Bischof, H. (2022). An Efficient Domain-Incremental Learning Approach to Drive in All Weather Conditions. 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), 3000–3010. https://doi.org/10.1109/cvprw56347.2022.00339
Robustness of Object Detectors in Degrading Weather Conditions
2021 IEEE International Intelligent Transportation Systems Conference (ITSC) / Sep 19, 2021
Mirza, M. J., Buerkle, C., Jarquin, J., Opitz, M., Oboril, F., Scholl, K.-U., & Bischof, H. (2021). Robustness of Object Detectors in Degrading Weather Conditions. 2021 IEEE International Intelligent Transportation Systems Conference (ITSC), 2719–2724. https://doi.org/10.1109/itsc48978.2021.9564505
MAtch, eXpand and Improve: Unsupervised Finetuning for Zero-Shot Action Recognition with Language Knowledge
2023 IEEE/CVF International Conference on Computer Vision (ICCV) / Oct 01, 2023
Lin, W., Karlinsky, L., Shvetsova, N., Possegger, H., Kozinski, M., Panda, R., Feris, R., Kuehne, H., & Bischof, H. (2023). MAtch, eXpand and Improve: Unsupervised Finetuning for Zero-Shot Action Recognition with Language Knowledge. 2023 IEEE/CVF International Conference on Computer Vision (ICCV), 2839–2850. https://doi.org/10.1109/iccv51070.2023.00267
Video Test-Time Adaptation for Action Recognition
2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) / Jun 01, 2023
Lin, W., Mirza, M. J., Kozinski, M., Possegger, H., Kuehne, H., & Bischof, H. (2023). Video Test-Time Adaptation for Action Recognition. 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 22952–22961. https://doi.org/10.1109/cvpr52729.2023.02198
ActMAD: Activation Matching to Align Distributions for Test-Time-Training
2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) / Jun 01, 2023
Mirza, M. J., Soneira, P. J., Lin, W., Kozinski, M., Possegger, H., & Bischof, H. (2023). ActMAD: Activation Matching to Align Distributions for Test-Time-Training. 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 24152–24161. https://doi.org/10.1109/cvpr52729.2023.02313
Towards Multimodal In-context Learning for Vision and Language Models
Lecture Notes in Computer Science / Jan 01, 2025
Doveh, S., Perek, S., Mirza, M. J., Lin, W., Alfassy, A., Arbelle, A., Ullman, S., & Karlinsky, L. (2025). Towards Multimodal In-context Learning for Vision and Language Models. In Computer Vision – ECCV 2024 Workshops (pp. 250–267). Springer Nature Switzerland. https://doi.org/10.1007/978-3-031-93806-1_19
MATE: Masked Autoencoders are Online 3D Test-Time Learners
2023 IEEE/CVF International Conference on Computer Vision (ICCV) / Oct 01, 2023
Mirza, M. J., Shin, I., Lin, W., Schriebl, A., Sun, K., Choe, J., Kozinski, M., Possegger, H., Kweon, I. S., Yoon, K.-J., & Bischof, H. (2023). MATE: Masked Autoencoders are Online 3D Test-Time Learners. 2023 IEEE/CVF International Conference on Computer Vision (ICCV), 16663–16672. https://doi.org/10.1109/iccv51070.2023.01532
Meta-prompting for Automating Zero-Shot Visual Recognition with LLMs
Lecture Notes in Computer Science / Oct 20, 2024
Mirza, M. J., Karlinsky, L., Lin, W., Doveh, S., Micorek, J., Kozinski, M., Kuehne, H., & Possegger, H. (2024). Meta-prompting for Automating Zero-Shot Visual Recognition with LLMs. In Computer Vision – ECCV 2024 (pp. 370–387). Springer Nature Switzerland. https://doi.org/10.1007/978-3-031-72627-9_21
Comment on the Paper Titled ‘The Origin of Quantum Mechanical Statistics: Insights from Research on Human Language’ (arXiv preprint arXiv:2407.14924, 2024)
Dec 02, 2024
Sienicki, K. (2024). Comment on the Paper Titled ‘The Origin of Quantum Mechanical Statistics: Insights from Research on Human Language’ (arXiv preprint arXiv:2407.14924, 2024). https://doi.org/10.20944/preprints202411.2377.v1
ConMe: Rethinking Evaluation of Compositional Reasoning for Modern VLMs
Advances in Neural Information Processing Systems 37 / Jan 01, 2024
Arbelle, A., Butoi, V., Darrell, T., Doveh, S., Feris, R., Gan, C., Hansen, J., Herzig, R., Huang, I., Karlinsky, L., Kuehne, H., Lin, W., Mirza, M., & Oliva, A. (2024). ConMe: Rethinking Evaluation of Compositional Reasoning for Modern VLMs. Advances in Neural Information Processing Systems 37, 22927–22946. https://doi.org/10.52202/079017-0721
Can Biases in ImageNet Models Explain Generalization?
2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) / Jun 16, 2024
Gavrikov, P., & Keuper, J. (2024). Can Biases in ImageNet Models Explain Generalization? 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 22184–22194. https://doi.org/10.1109/cvpr52733.2024.02094
Comparison Visual Instruction Tuning
2025 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW) / Jun 11, 2025
Lin, W., Mirza, M. J., Doveh, S., Feris, R., Giryes, R., Hochreiter, S., & Karlinsky, L. (2025). Comparison Visual Instruction Tuning. 2025 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), 2964–2974. https://doi.org/10.1109/cvprw67362.2025.00280
Preprint site arXiv is banning computer-science reviews: here’s why
Nature / Nov 07, 2025
Castelvecchi, D. (2025). Preprint site arXiv is banning computer-science reviews: here’s why. Nature. https://doi.org/10.1038/d41586-025-03664-7
TTT-KD: Test-Time Training for 3D Semantic Segmentation Through Knowledge Distillation From Foundation Models
2025 International Conference on 3D Vision (3DV) / Mar 25, 2025
Weijler, L., Mirza, M. J., Sick, L., Ekkazan, C., & Hermosilla, P. (2025). TTT-KD: Test-Time Training for 3D Semantic Segmentation Through Knowledge Distillation From Foundation Models. 2025 International Conference on 3D Vision (3DV), 1264–1274. https://doi.org/10.1109/3dv66043.2025.00120
Exploring Modality Guidance to Enhance VFM-Based Feature Fusion for UDA in 3D Semantic Segmentation
2025 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW) / Jun 11, 2025
Spoecklberger, J., Lin, W., Hermosilla, P., Doveh, S., Possegger, H., & Mirza, M. J. (2025). Exploring Modality Guidance to Enhance VFM-Based Feature Fusion for UDA in 3D Semantic Segmentation. 2025 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), 4789–4798. https://doi.org/10.1109/cvprw67362.2025.00465
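ARXİV SƏNƏDLƏRİNİN MÜHAFİZƏXANALARDA MÜHAFİZƏ QAYDALARI [Rules for Preserving Archival Documents in Repositories]
ADMİU Elmi Əsərlər / Jan 01, 2025
ARXİV SƏNƏDLƏRİNİN MÜHAFİZƏXANALARDA MÜHAFİZƏ QAYDALARI [Rules for Preserving Archival Documents in Repositories]. (2025). ADMİU Elmi Əsərlər. https://doi.org/10.52094/2221-7584.2025.37.31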
Test-Time Adversarial Detection and Robustness for Localizing Humans Using Ultra Wide Band Channel Impulse Responses
2023 31st European Signal Processing Conference (EUSIPCO) / Sep 04, 2023
Kolli, A., Mirza, M. J., Possegger, H., & Bischof, H. (2023). Test-Time Adversarial Detection and Robustness for Localizing Humans Using Ultra Wide Band Channel Impulse Responses. 2023 31st European Signal Processing Conference (EUSIPCO), 1365–1369. https://doi.org/10.23919/eusipco58844.2023.10290092
Sit Back and Relax: Learning to Drive Incrementally in All Weather Conditions
2023 IEEE Intelligent Vehicles Symposium (IV) / Jun 04, 2023
Leitner, S., Mirza, M. J., Lin, W., Micorek, J., Masana, M., Kozinski, M., Possegger, H., & Bischof, H. (2023). Sit Back and Relax: Learning to Drive Incrementally in All Weather Conditions. 2023 IEEE Intelligent Vehicles Symposium (IV), 1–8. https://doi.org/10.1109/iv55152.2023.10186818
Influence Prediction in Collaboration Networks: An Empirical Study on arXiv
Sep 17, 2025
Lin, M., Schaposnik, L. P., & Wu, R. (2025). Influence Prediction in Collaboration Networks: An Empirical Study on arXiv. https://doi.org/10.21203/rs.3.rs-7401473/v1
Evaluation of Spatio-Temporal Small Object Detection in Real-World Adverse Weather Conditions
2025 IEEE/CVF Winter Conference on Applications of Computer Vision Workshops (WACVW) / Feb 28, 2025
Van Lier, M., Van Leeuwen, M., Van Manen, B., Kampmeijer, L., & Boehrer, N. (2025). Evaluation of Spatio-Temporal Small Object Detection in Real-World Adverse Weather Conditions. 2025 IEEE/CVF Winter Conference on Applications of Computer Vision Workshops (WACVW), 786–797. https://doi.org/10.1109/wacvw65960.2025.00094
DRT: Detection Refinement for Multiple Object Tracking
Proceedings of the British Machine Vision Conference 2021 / Jan 01, 2021
Wang, B., Fruhwirth-Reisinger, C., Possegger, H., Bischof, H., & Cao, G. (2021). DRT: Detection Refinement for Multiple Object Tracking. Proceedings of the British Machine Vision Conference 2021. https://doi.org/10.5244/c.35.43
Detector-Free Weakly Supervised Grounding by Separation
2021 IEEE/CVF International Conference on Computer Vision (ICCV) / Oct 01, 2021
Arbelle, A., Doveh, S., Alfassy, A., Shtok, J., Lev, G., Schwartz, E., Kuehne, H., Levi, H. B., Sattigeri, P., Panda, R., Chen, C.-F., Bronstein, A., Saenko, K., Ullman, S., Giryes, R., Feris, R., & Karlinsky, L. (2021). Detector-Free Weakly Supervised Grounding by Separation. 2021 IEEE/CVF International Conference on Computer Vision (ICCV), 1781–1792. https://doi.org/10.1109/iccv48922.2021.00182
Semi-Supervised Audio-Visual Action Recognition with Audio Source Localization Guided Mixup
2025 IEEE 35th International Workshop on Machine Learning for Signal Processing (MLSP) / Aug 31, 2025
Kang, S., & Kim, T. (2025). Semi-Supervised Audio-Visual Action Recognition with Audio Source Localization Guided Mixup. 2025 IEEE 35th International Workshop on Machine Learning for Signal Processing (MLSP), 1–6. https://doi.org/10.1109/mlsp62443.2025.11204238
Affine calibration from moving objects
Proceedings Eighth IEEE International Conference on Computer Vision. ICCV 2001
Manning, R. A., & Dyer, C. R. (2001). Affine calibration from moving objects. Proceedings Eighth IEEE International Conference on Computer Vision. ICCV 2001, 1, 494–500. https://doi.org/10.1109/iccv.2001.937557
Shape-Biased Texture Agnostic Representations for Improved Textureless and Metallic Object Detection and 6D Pose Estimation
2025 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV) / Feb 26, 2025
Hönig, P., Thalhammer, S., Weibel, J.-B., Hirschmanner, M., & Vincze, M. (2025). Shape-Biased Texture Agnostic Representations for Improved Textureless and Metallic Object Detection and 6D Pose Estimation. 2025 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), 8806–8815. https://doi.org/10.1109/wacv61041.2025.00853
Online Continual Learning of Diffusion Models: Multi-Mode Adaptive Generative Distillation
2025 IEEE International Conference on Image Processing (ICIP) / Sep 14, 2025
Yang, R., Grard, M., Dellandrea, E., & Chen, L. (2025). Online Continual Learning of Diffusion Models: Multi-Mode Adaptive Generative Distillation. 2025 IEEE International Conference on Image Processing (ICIP), 1001–1006. https://doi.org/10.1109/icip55913.2025.11084576
Enhancing Zero-Shot Vision Models by Label-Free Prompt Distribution Learning and Bias Correcting
Advances in Neural Information Processing Systems 37 / Jan 01, 2024
Hao, Y., Tan, Y., Wang, S., Zhang, H., Zhu, B., & Zhu, X. (2024). Enhancing Zero-Shot Vision Models by Label-Free Prompt Distribution Learning and Bias Correcting. Advances in Neural Information Processing Systems 37, 2001–2025. https://doi.org/10.52202/079017-0064
Education
TU Graz, Austria
Ph.D. in Computer Vision / 2024
KIT, Germany
MS in ETIT / 2020
NUST, Pakistan
BS in EE / 2017
Graz University of Technology
Ph.D. in Computer Science / 2024
Karlsruhe Institute of Technology
M.Sc. in Electrical Engineering and Information Technology / 2020
National University of Sciences and Technology
B.Sc. in Electrical Engineering / 2017
Experience
Massachusetts Institute of Technology (MIT)
Postdoctoral Researcher / November, 2024 — December
Leading research on multimodal learning that combines speech, vision, and language for scalable AI systems. Designing and evaluating methods to improve fine-grained reasoning in large language and vision-language models.
Graz University of Technology
Computer Vision Project Assistant / January, 2021 — October, 2024
Developed self-supervised and unsupervised learning techniques to improve neural network robustness to distribution shifts at test time. Conducted extensive research on LLMs and multimodal VLMs, resulting in multiple publications at NeurIPS, ICCV, and CVPR.
Sony AI
Research Scientist Intern / May, 2024 — August, 2024
Designed multimodal learning methods integrating vision, audio, and language signals. Prototyped and evaluated models for cross-modal understanding in real-world scenarios.
Intel Labs
Master Thesis Researcher / January, 2020 — July, 2020
Evaluated robustness of state-of-the-art 2D and 3D object detectors for autonomous driving under adverse weather.
C++ Developer Intern / October, 2019 — December, 2019
Implemented state estimation with an Unscented Kalman Filter in C++ and OpenCV for real-time object tracking.
Intel
Platform Application Engineer Intern / March, 2019 — August, 2019
Built an automation framework, including PCB design and microcontroller integration, to streamline internal workflows.