Tushar Krishna is an Associate Professor in the School of Electrical and Computer Engineering (ECE) at Georgia Institute of Technology, with a courtesy appointment in Computer Science. He held the ON Semiconductor (Endowed) Junior Professorship in ECE at Georgia Tech from 2019 to 2021. He has also been a visiting professor at MIT (EECS and CSAIL) and Harvard University (CS), and a researcher in Intel’s VSSAD group. He has a Ph.D. in Electrical Engineering and Computer Science from MIT (2014), an M.S.E. in Electrical Engineering from Princeton University (2009), and a B.Tech. in Electrical Engineering from the Indian Institute of Technology (IIT) Delhi (2007). Dr. Krishna’s research spans computer architecture, interconnection networks, networks-on-chip (NoC), and AI/ML accelerator systems, with a focus on optimizing data movement in modern computing platforms. His research is funded via multiple awards from NSF, DARPA, IARPA, SRC (including JUMP 2.0), the Department of Energy, Intel, Google, Meta/Facebook, Qualcomm, and TSMC. His papers have been cited over 18,000 times. Three of his papers have been selected for IEEE Micro’s Top Picks from Computer Architecture, one more received an honorable mention, and four have won best paper awards. Dr. Krishna was inducted into the HPCA Hall of Fame in 2022. At Georgia Tech, he has been honored with the “Class of 1940 Course Survey Teaching Effectiveness Award” in 2018, the “Roger P. Webb Outstanding Junior Faculty Award” from the School of ECE in 2021, the “Richard M. Bass/Eta Kappa Nu Outstanding Junior Teacher Award” in 2023, and the “Roger P. Webb Outstanding Mid-career Faculty Award” from the School of ECE in 2024. Dr. Krishna currently serves as an Associate Director for the Center for Research into Novel Computing Hierarchies (CRNCH), a cross-disciplinary research center at Georgia Tech. He is also a co-chair of the Chakra Execution Traces and Benchmarks Working Group within MLCommons.

Dr. Tushar Krishna

Associate Professor, Georgia Institute of Technology. Expert in AI hardware, distributed AI systems, interconnection networks, and computer architecture.

Research Expertise

Computer Architecture

Interconnection Networks

Network-on-Chip

Deep Learning Accelerators

Publications

The gem5 simulator

ACM SIGARCH Computer Architecture News

2011

14.5 Eyeriss: An energy-efficient reconfigurable accelerator for deep convolutional neural networks

2016 IEEE International Solid-State Circuits Conference (ISSCC)

2016

GARNET: A detailed on-chip network model inside a full-system simulator

2009 IEEE International Symposium on Performance Analysis of Systems and Software

2009

SIGMA: A Sparse and Irregular GEMM Accelerator with Flexible Interconnects for DNN Training

2020 IEEE International Symposium on High Performance Computer Architecture (HPCA)

2020

Understanding Reuse, Performance, and Hardware Cost of DNN Dataflow

Proceedings of the 52nd Annual IEEE/ACM International Symposium on Microarchitecture

2019

Comment on the Paper Titled ’The Origin of Quantum Mechanical Statistics: Insights from Research on Human Language’ (arXiv preprint arXiv:2407.14924, 2024)

Unknown Venue

2024

A Systematic Methodology for Characterizing Scalability of DNN Accelerators using SCALE-Sim

2020 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS)

2020

MAESTRO: A Data-Centric Approach to Understand Reuse, Performance, and Hardware Cost of DNN Mappings

IEEE Micro

2020

SCORPIO

ACM SIGARCH Computer Architecture News

2014

On-Chip Networks

Synthesis Lectures on Computer Architecture

2017

GAMMA

Proceedings of the 39th International Conference on Computer-Aided Design

2020

Heterogeneous Dataflow Accelerators for Multi-DNN Workloads

2021 IEEE International Symposium on High-Performance Computer Architecture (HPCA)

2021

Breaking the on-chip latency barrier using SMART

2013 IEEE 19th International Symposium on High Performance Computer Architecture (HPCA)

2013

Co-Exploration of Neural Architectures and Heterogeneous ASIC Accelerator Designs Targeting Multiple Tasks

2020 57th ACM/IEEE Design Automation Conference (DAC)

2020

Characterizing the Deployment of Deep Neural Networks on Commercial Edge Devices

2019 IEEE International Symposium on Workload Characterization (IISWC)

2019

SMART: A Single-Cycle Reconfigurable NoC for SoC Applications

Design, Automation & Test in Europe Conference & Exhibition (DATE), 2013

2013

ConfuciuX: Autonomous Hardware Resource Assignment for DNN Accelerators using Reinforcement Learning

2020 53rd Annual IEEE/ACM International Symposium on Microarchitecture (MICRO)

2020

Towards the ideal on-chip fabric for 1-to-many and many-to-1 communication

Proceedings of the 44th Annual IEEE/ACM International Symposium on Microarchitecture

2011

Approaching the theoretical limits of a mesh NoC with a 16-node chip prototype in 45nm SOI

Proceedings of the 49th Annual Design Automation Conference

2012

NoC with Near-Ideal Express Virtual Channels Using Global-Line Communication

2008 16th IEEE Symposium on High Performance Interconnects

2008

Rethinking NoCs for Spatial Neural Network Accelerators

Proceedings of the Eleventh IEEE/ACM International Symposium on Networks-on-Chip

2017

Architecture, Chip, and Package Codesign Flow for Interposer-Based 2.5-D Chiplet Integration Enabling Heterogeneous IP Reuse

IEEE Transactions on Very Large Scale Integration (VLSI) Systems

2020

SMART: Single-Cycle Multihop Traversals over a Shared Network on Chip

IEEE Micro

2014

Kite: A Family of Heterogeneous Interposer Topologies Enabled via Accurate Interconnect Modeling

2020 57th ACM/IEEE Design Automation Conference (DAC)

2020

SWIFT: A SWing-reduced interconnect for a Token-based Network-on-Chip in 90nm CMOS

2010 IEEE International Conference on Computer Design

2010

STONNE: Enabling Cycle-Level Microarchitectural Simulation for DNN Inference Accelerators

2021 IEEE International Symposium on Workload Characterization (IISWC)

2021

LATR

Proceedings of the Twenty-Third International Conference on Architectural Support for Programming Languages and Operating Systems

2018

ALRESCHA: A Lightweight Reconfigurable Sparse-Computation Accelerator

2020 IEEE International Symposium on High Performance Computer Architecture (HPCA)

2020

Architecture, Chip, and Package Co-design Flow for 2.5D IC Design Enabling Heterogeneous IP Reuse

Proceedings of the 56th Annual Design Automation Conference 2019

2019

ASTRA-SIM: Enabling SW/HW Co-Design Exploration for Distributed DL Training Platforms

2020 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS)

2020

OpenSMART: Single-cycle multi-hop NoC generator in BSV and Chisel

2017 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS)

2017

Sample US Patent

Commercializing Innovation

2015

PyramidInfer: Pyramid KV Cache Compression for High-throughput LLM Inference

Findings of the Association for Computational Linguistics ACL 2024

2024

Marvel: A Data-Centric Approach for Mapping Deep Learning Operators on Spatial Accelerators

ACM Transactions on Architecture and Code Optimization

2021

FLAT: An Optimized Dataflow for Mitigating Attention Bottlenecks

Proceedings of the 28th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Volume 2

2023

Efficient Control and Communication Paradigms for Coarse-Grained Spatial Architectures

ACM Transactions on Computer Systems

2015

SWIFT: A Low-Power Network-On-Chip Implementing the Token Flow Control Router Architecture With Swing-Reduced Interconnects

IEEE Transactions on Very Large Scale Integration (VLSI) Systems

2013

MAGMA: An Optimization Framework for Mapping Multiple DNNs on Multiple Accelerator Cores

2022 IEEE International Symposium on High-Performance Computer Architecture (HPCA)

2022

Understanding the Design-Space of Sparse/Dense Multiphase GNN dataflows on Spatial Accelerators

2022 IEEE International Parallel and Distributed Processing Symposium (IPDPS)

2022

Synchronized Progress in Interconnection Networks (SPIN): A New Theory for Deadlock Freedom

2018 ACM/IEEE 45th Annual International Symposium on Computer Architecture (ISCA)

2018

Enabling Compute-Communication Overlap in Distributed Deep Learning Training Platforms

2021 ACM/IEEE 48th Annual International Symposium on Computer Architecture (ISCA)

2021

Express Virtual Channels with Capacitively Driven Global Links

IEEE Micro

2009

Flexagon: A Multi-dataflow Sparse-Sparse Matrix Multiplication Accelerator for Efficient DNN Processing

Proceedings of the 28th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Volume 3

2023

mRNA: Enabling Efficient Mapping Space Exploration for a Reconfiguration Neural Accelerator

2019 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS)

2019

Architecting a Secure Wireless Network-on-Chip

2018 Twelfth IEEE/ACM International Symposium on Networks-on-Chip (NOCS)

2018

Single-cycle collective communication over a shared network fabric

2014 Eighth IEEE/ACM International Symposium on Networks-on-Chip (NoCS)

2014

Performance Implications of NoCs on 3D-Stacked Memories: Insights from the Hybrid Memory Cube

2018 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS)

2018

DRAIN: Deadlock Removal for Arbitrary Irregular Networks

2020 IEEE International Symposium on High Performance Computer Architecture (HPCA)

2020

Single-Cycle Multihop Asynchronous Repeated Traversal: A SMART Future for Reconfigurable On-Chip Networks

Computer

2013

SEESAW: Using Superpages to Improve VIPT Caches

2018 ACM/IEEE 45th Annual International Symposium on Computer Architecture (ISCA)

2018

Static Bubble: A Framework for Deadlock-Free Irregular On-chip Topologies

2017 IEEE International Symposium on High Performance Computer Architecture (HPCA)

2017

A Proposed Meta-Reality Immersive Development Pipeline: Generative AI Models and Extended Reality (XR) Content for the Metaverse

Journal of Intelligent Learning Systems and Applications

2023

DiGamma: Domain-aware Genetic Algorithm for HW-Mapping Co-optimization for DNN Accelerators

2022 Design, Automation & Test in Europe Conference & Exhibition (DATE)

2022

SWAP

Proceedings of the 52nd Annual IEEE/ACM International Symposium on Microarchitecture

2019

ASTRA-sim2.0: Modeling Hierarchical Networks and Disaggregated Systems for Large-model Training at Scale

2023 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS)

2023

Themis

Proceedings of the 49th Annual International Symposium on Computer Architecture

2022

Evaluating Spatial Accelerator Architectures with Tiled Matrix-Matrix Multiplication

IEEE Transactions on Parallel and Distributed Systems

2022

Scaling the Cascades: Interconnect-Aware FPGA Implementation of Machine Learning Problems

2019 29th International Conference on Field Programmable Logic and Applications (FPL)

2019

Scalable Distributed Last-Level TLBs Using Low-Latency Interconnects

2018 51st Annual IEEE/ACM International Symposium on Microarchitecture (MICRO)

2018

Extending Sparse Tensor Accelerators to Support Multiple Compression Formats

2021 IEEE International Parallel and Distributed Processing Symposium (IPDPS)

2021

Pitstop: Enabling a Virtual Network Free Network-on-Chip

2021 IEEE International Symposium on High-Performance Computer Architecture (HPCA)

2021

GeneSys: Enabling Continuous Learning through Neural Network Evolution in Hardware

2018 51st Annual IEEE/ACM International Symposium on Microarchitecture (MICRO)

2018

Physical vs. Virtual Express Topologies with Low-Swing Links for Future Many-Core NoCs

2010 Fourth ACM/IEEE International Symposium on Networks-on-Chip

2010

Data Orchestration in Deep Learning Accelerators

Synthesis Lectures on Computer Architecture

2020

Flexion: A Quantitative Metric for Flexibility in DNN Accelerators

IEEE Computer Architecture Letters

2021

A novel network fabric for efficient spatio-temporal reduction in flexible DNN accelerators

Proceedings of the 15th IEEE/ACM International Symposium on Networks-on-Chip

2021

VEGETA: Vertically-Integrated Extensions for Sparse/Dense GEMM Tile Acceleration on CPUs

2023 IEEE International Symposium on High-Performance Computer Architecture (HPCA)

2023

Union: A Unified HW-SW Co-Design Ecosystem in MLIR for Evaluating Tensor Operations on Spatial Accelerators

2021 30th International Conference on Parallel Architectures and Compilation Techniques (PACT)

2021

A low-swing crossbar and link generator for low-power networks-on-chip

2011 IEEE/ACM International Conference on Computer-Aided Design (ICCAD)

2011

Towards Cognitive AI Systems: Workload and Characterization of Neuro-Symbolic AI

2024 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS)

2024

RASA: Efficient Register-Aware Systolic Array Matrix Engine for CPU

2021 58th ACM/IEEE Design Automation Conference (DAC)

2021

Dataflow-Architecture Co-Design for 2.5D DNN Accelerators using Wireless Network-on-Package

Proceedings of the 26th Asia and South Pacific Design Automation Conference

2021

BINDU

Proceedings of the 13th IEEE/ACM International Symposium on Networks-on-Chip

2019

Brownian Bubble Router: Enabling Deadlock Freedom via Guaranteed Forward Progress

2018 Twelfth IEEE/ACM International Symposium on Networks-on-Chip (NOCS)

2018

Reinforcement learning based interconnection routing for adaptive traffic optimization

Proceedings of the 13th IEEE/ACM International Symposium on Networks-on-Chip

2019

Texture Filter Memory — a power-efficient and scalable texture memory architecture for mobile graphics processors

2008 IEEE/ACM International Conference on Computer-Aided Design

2008

Demystifying Map Space Exploration for NPUs

2022 IEEE International Symposium on Workload Characterization (IISWC)

2022

Self adaptive reconfigurable arrays (SARA)

Proceedings of the 59th ACM/IEEE Design Automation Conference

2022

A Communication-Centric Approach for Designing Flexible DNN Accelerators

IEEE Micro

2018

Architecture, Dataflow and Physical Design Implications of 3D-ICs for DNN-Accelerators

2021 22nd International Symposium on Quality Electronic Design (ISQED)

2021

Optimizing the data placement and transformation for multi-bank CGRA computing system

2018 Design, Automation & Test in Europe Conference & Exhibition (DATE)

2018

Understanding the Impact of On-chip Communication on DNN Accelerator Performance

2019 26th IEEE International Conference on Electronics, Circuits and Systems (ICECS)

2019

FastTrack: Leveraging Heterogeneous FPGA Wires to Design Low-Cost High-Performance Soft NoCs

2018 ACM/IEEE 45th Annual International Symposium on Computer Architecture (ISCA)

2018

Spoofing Prevention via RF Power Profiling in Wireless Network-on-Chip

Proceedings of the 3rd International Workshop on Advanced Interconnect Solutions and Technologies for Emerging Computing Systems

2018

Algorithm-Hardware Co-Design of Distribution-Aware Logarithmic-Posit Encodings for Efficient DNN Inference

Proceedings of the 61st ACM/IEEE Design Automation Conference

2024

Rapamycin in the context of Pascal’s Wager: generative pre-trained transformer perspective

Oncoscience

2022

ECO TLB

ACM Transactions on Architecture and Code Optimization

2020

The first historical account of Vietnam mathematics on arXiv

Unknown Venue

2022

Scalable Distributed Training of Recommendation Models: An ASTRA-SIM + NS3 case-study with TCP/IP transport

2020 IEEE Symposium on High-Performance Interconnects (HOTI)

2020

Locality-oblivious cache organization leveraging single-cycle multi-hop NoCs

Proceedings of the 19th international conference on Architectural support for programming languages and operating systems

2014

Stay in your Lane: A NoC with Low-overhead Multi-packet Bypassing

2022 IEEE International Symposium on High-Performance Computer Architecture (HPCA)

2022

Scientometric engineering: Exploring citation dynamics via arXiv eprints

Quantitative Science Studies

2022

DUB

Proceedings of the 15th IEEE/ACM International Symposium on Networks-on-Chip

2021

Demo: An Optimization Framework to Select Edge Servers for Automotive Connected Services

2019 IEEE Vehicular Networking Conference (VNC)

2019

Education

Massachusetts Institute of Technology

Ph.D., Electrical Engineering and Computer Science / June, 2014

Cambridge, Massachusetts, United States of America

Princeton University

M.S.E., Electrical Engineering / August, 2009

Princeton, New Jersey, United States of America

Indian Institute of Technology Delhi

B.Tech., Electrical Engineering / August, 2007

New Delhi, India

Experience

Georgia Institute of Technology

Associate Professor (with tenure) / August, 2015 — Present

Massachusetts Institute of Technology

Visiting Associate Professor / July, 2023 — July, 2024

Harvard University

Visiting Researcher / July, 2024 — Present

Intel

Design Engineer / November, 2013 — December, 2014
