Projects

Protopia AI

Protopia generates curated noise around the source signal, safeguarding data from unauthorized inference.

Noise for Privacy-Aware AI

Demo video of our technology on a face detection task
  • Development of Protopia AI’s noise-training technology for inference privacy, as well as proofs of concept on various tasks (2020.10 – present) NeurIPS-D’21
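
To make the idea above concrete, here is a minimal, self-contained sketch of learning input noise that keeps a model’s prediction intact while obscuring the raw signal. The tiny model, data, loss weighting, and noise parameterization are all invented for illustration; this is not Protopia AI’s actual method or API.

```python
import torch
import torch.nn as nn

# Illustrative sketch only: learn a per-feature noise scale that preserves the
# task prediction while maximizing how much of the raw input is hidden.
torch.manual_seed(0)
model = nn.Linear(16, 4)                     # stand-in for a trained task model
for p in model.parameters():
    p.requires_grad_(False)                  # the task model stays fixed
x = torch.randn(32, 16)                      # stand-in for source data
y = model(x).argmax(dim=1)                   # "labels" = clean predictions

log_sigma = nn.Parameter(torch.zeros(16))    # learned noise scale per feature
opt = torch.optim.Adam([log_sigma], lr=0.05)
ce = nn.CrossEntropyLoss()

for step in range(200):
    noise = torch.randn_like(x) * log_sigma.exp()
    task_loss = ce(model(x + noise), y)      # keep the task working on noised input
    privacy_bonus = log_sigma.mean()         # reward larger noise (more obfuscation)
    loss = task_loss - 0.1 * privacy_bonus
    opt.zero_grad()
    loss.backward()
    opt.step()

print("learned noise scales:", log_sigma.exp().detach())
```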

University of California, San Diego

From Statement of Purpose for admission to UCSD

My Ph.D. will be an adventure through various ways of optimizing systems 🙂
The following are the projects I work on:

Machine Learning for Compiler Optimization

Overview of the Adaptive Code Optimization for Expedited Deep Neural Network Compilation (ICLR’20)
  • Reinforcement Learning and Adaptive Sampling for Optimized DNN Compilation (2019.02 – present) ICLR’20, ICML-W’19

Spatial Multi-Tenancy for DNN Accelerators

Illustration of Dynamic Architecture Fission for Spatial Multi-Tenant Acceleration of DNNs (MICRO’20)
  • Dynamic Architecture Fission for Spatial Multi-Tenant Acceleration of DNNs (2020.02 – present) MICRO’20

Others

  • Neural Architecture Search on Resource-limited Platforms IEEE SMC’21 – with Mälardalen University in Sweden

Qualcomm AI Research

During my internship at Qualcomm, I worked on developing compiler optimization techniques to reduce the memory footprint of executing deep neural networks.

Compiler Optimization for Machine Learning

Overview of the Memory-Aware Scheduling of Irregularly Wired Neural Networks (MLSys’20)
  • Memory-Aware Scheduling of Irregularly Wired Neural Networks for Edge Devices (2019.07-2019.09) MLSys’20
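
As a rough illustration of what memory-aware scheduling means here, the toy sketch below brute-forces a topological order of a small, irregularly wired graph to minimize peak live-activation memory. The graph, activation sizes, and brute-force search are invented for this example and are not the MLSys’20 algorithm, which scales far better.

```python
from itertools import permutations

# Toy irregularly wired graph: node -> (activation size, list of input nodes).
graph = {
    "in":  (4, []),
    "a":   (8, ["in"]),
    "b":   (8, ["in"]),
    "c":   (2, ["a", "b"]),
    "out": (1, ["c"]),
}
consumers = {n: [m for m, (_, ins) in graph.items() if n in ins] for n in graph}

def is_topological(order):
    seen = set()
    for n in order:
        if any(p not in seen for p in graph[n][1]):
            return False
        seen.add(n)
    return True

def peak_memory(order):
    """Peak total size of live activations when executing nodes in `order`."""
    live, peak = {}, 0
    remaining = {n: len(consumers[n]) for n in graph}
    for n in order:
        size, ins = graph[n]
        live[n] = size                       # output buffer becomes live
        peak = max(peak, sum(live.values()))
        for p in ins:                        # free inputs once all consumers ran
            remaining[p] -= 1
            if remaining[p] == 0:
                del live[p]
    return peak

best = min((o for o in permutations(graph) if is_topological(o)), key=peak_memory)
print("best order:", best, "peak memory:", peak_memory(best))
```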

Samsung Research

For three and a half years, I was an engineer at Samsung Research, where I worked on running various applications on embedded devices, tackling the problem at various levels of the system stack.

Samsung Reconfigurable Processor (SRP)

  • Developed a swing modulo scheduler for the VLIW engine, achieving up to a 250% performance improvement
  • Reduced the scheduling time and improved the schedule quality of the edge-centric modulo scheduler for the CGRA, using an adaptive algorithm to optimize routing during scheduling; as a result, achieved smaller initiation intervals (II) for many kernels, significantly faster (a toy illustration of the II bound follows the note below)
  • Ported the LLVM compiler’s backends and the C standard library to multiple architectures
  • Implemented and debugged various tools to enhance the usability of the toolchain

* The Samsung Reconfigurable Processor is a DSP with a VLIW-plus-CGRA architecture that has been employed in Samsung’s TVs, Galaxy smartphones, and various other products
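
To give a feel for the initiation interval (II) mentioned above, the toy calculation below computes the standard resource- and recurrence-constrained lower bounds on the II of a modulo-scheduled loop. The op counts, functional-unit counts, and recurrence latency are invented for illustration and do not describe the SRP.

```python
import math

# Lower bound on the initiation interval (II) of a modulo-scheduled loop:
# the II can be no smaller than the resource bound (ResMII) or the
# recurrence bound (RecMII). All numbers below are made up.
ops = {"alu": 6, "mul": 2, "mem": 3}     # ops per loop iteration, by resource class
units = {"alu": 2, "mul": 1, "mem": 1}   # functional units available per class

res_mii = max(math.ceil(ops[r] / units[r]) for r in ops)

# One loop-carried recurrence: total latency 4 over a dependence distance of 1.
rec_mii = math.ceil(4 / 1)

print("II lower bound:", max(res_mii, rec_mii))
```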

Samsung Neural Processor (SNP)

  • Developed the SNP system simulator used for architecture exploration, hardware verification, and software development
  • Ported, quantized, and optimized neural networks, including LeNet and SqueezeNet, as well as other applications, to run on the SNP (a toy sketch of the quantization step follows the note below)

* The Samsung Neural Processor is a neural network accelerator targeted at Samsung’s future products, including TVs and smartphones
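
The quantization step mentioned above can be illustrated with a minimal post-training, symmetric 8-bit scheme for a single weight tensor. Real deployments calibrate per layer or per channel and also quantize activations, so this is only a sketch under simplifying assumptions.

```python
import numpy as np

# Toy symmetric int8 post-training quantization of one weight tensor.
def quantize_int8(w):
    scale = np.abs(w).max() / 127.0                      # one scale for the tensor
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    return q.astype(np.float32) * scale

w = np.random.randn(64, 32).astype(np.float32)           # stand-in weights
q, s = quantize_int8(w)
print("max abs quantization error:", np.abs(w - dequantize(q, s)).max())
```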