I currently work full-time at Qualcomm Inc., Raleigh, NC.
I graduated in December 2012 with a Ph.D. from North Carolina State University, where my advisor was Dr. Eric Rotenberg. Before joining NCSU, I studied at Temple University, Philadelphia, where I earned my Master's degree in Electrical Engineering under Dr. Musoke Sendaula. For my Master's thesis, I had the opportunity to work with Dr. Amir Roth at the University of Pennsylvania, Philadelphia. Prior to that, I completed my Bachelor of Engineering degree in Electronics Engineering at the University of Mumbai, India.
In my free time, I enjoy photography and playing the guitar.
Senior Engineer, Qualcomm Inc., Raleigh, NC.
Jun 2013 - present.
+ Performance modeling lead for L2 cache unit and processor-system interface.
+ Perform model vs. RTL correlation to ensure cycle-level accuracy of the performance model.
+ Experiment with microarchitectural ideas to improve performance of next-generation CPUs.
+ Analyze application behavior, identify performance bottlenecks, and recommend microarchitectural solutions to improve the performance of future CPUs.
Performance Tools Intern, Intel Corp., Santa Clara, CA.
Feb 2011 - Aug 2011.
+ Ported SEP, a performance profiling tool used in Intel® VTune™, to enable performance monitoring on heterogeneous processor platforms.
+ Augmented SEP to allow independent monitoring of events on different processor types.
+ Analyzed the performance characteristics of SPEC CPU benchmarks using SEP on different types of processors.
+ Investigated auto-parallelization of SPEC CPU benchmarks to improve processor performance and utilization on heterogeneous processor platforms.
Heterogeneous multi-core design, novel processor microarchitecture, instruction-level parallelism, microarchitecture simulation tools.
My research develops methods for recommending a set of core designs for a heterogeneous multi-core processor using the FabScalar framework. The constituent cores are chosen to maximize single-thread performance across a wide range of applications while minimizing the performance degradation caused by application diversity and scheduling variability.
I developed a cycle-accurate simulator for a customizable RTL model of a superscalar processor for the FabScalar framework. This tool provides fast, RTL-accurate performance estimates for studies such as design-space exploration of processors and pre-RTL evaluation of ideas.
- A unified view of non-monotonic core selection and application steering in heterogeneous chip multiprocessors. S. Navada, N. K. Choudhary, S. V. Wadhavkar, and E. Rotenberg. Proceedings of the 22nd International Conference on Parallel Architectures and Compilation Techniques (PACT '13), pp. 133-144, September 2013. [doi] [pdf]
- FabScalar: Automating Superscalar Core Design. N. K. Choudhary, S. V. Wadhavkar, T. A. Shah, H. Mayukh, J. Gandhi, B. H. Dwiel, S. Navada, H. H. Najaf-abadi, and E. Rotenberg. IEEE Micro, Special Issue: Micro's Top Picks from Computer Architecture Conferences, vol. 32, no. 3, pp. 48-59, May-June 2012. [doi]
- FabScalar: Composing Synthesizable RTL Designs of Arbitrary Cores within a Canonical Superscalar Template. N. K. Choudhary, S. V. Wadhavkar, T. A. Shah, H. Mayukh, J. Gandhi, B. H. Dwiel, S. Navada, H. H. Najaf-abadi, and E. Rotenberg. Proceedings of the 38th IEEE/ACM International Symposium on Computer Architecture (ISCA-38), pp. 11-22, June 2011. [pdf]
- FabScalar. Niket K. Choudhary, Salil Wadhavkar, Tanmay Shah, Sandeep Navada, Hashem Hashemi, and Eric Rotenberg. Workshop on Architectural Research Prototyping (WARP), held in conjunction with ISCA-36, June 2009. [pdf]