About me

Hi, I am a PhD student in the Stochastic Systems and Learning Laboratory (S2L2), advised by Prof. Rahul Jain. I am interested in reinforcement learning, online learning, stochastic control, and high-dimensional statistics. During my PhD, I have worked primarily on approximate dynamic programming for reinforcement learning, with a focus on designing interpretable (not black-box) learning algorithms. I designed dynamic-programming-based algorithms that use random feature kernel methods, and tested them on the OpenAI Gym simulator, locomotion tasks in MuJoCo, and quadrupedal robot movement in PyBullet. We also analyzed these algorithms to provide probabilistic performance guarantees. Before coming to USC, I had a great time at IIT Bombay.

Publications

W. Haskell, R. Jain, H. Sharma and P. Yu, “An Empirical Dynamic Programming Algorithm for Continuous MDPs”, IEEE Transactions on Automatic Control (TAC) 2020
H. Sharma, R. Jain, “A Reinforcement Learning Algorithm for Continuous State MDPs with Finite Time Guarantees”, L4DC 2020 (submitted)
C. Wei, M. Jafarnia-Jahromi, H. Luo, H. Sharma, R. Jain, “Model-free Reinforcement Learning in Infinite-horizon Average-reward Markov Decision Processes”, submitted
H. Sharma, R. Jain, “An approximately optimal algorithm for relative value learning for averaged MDPs with continuous states and actions”, Allerton 2019
H. Sharma, R. Jain, W. Haskell, “Empirical Algorithms for Stochastic Systems with Continuous States and Actions”, IEEE Control and Decision Conf. (CDC) 2019
H. Sharma, M. Jafarnia-Jahromi, R. Jain, “Approximate Relative Value Learning for Average-reward Continuous State MDPs”, Uncertainty in Artificial Intelligence (UAI) 2019
H. Sharma, A. Gupta and R. Jain, “An Empirical Relative Value Learning Algorithm for Non-parametric MDPs with Continuous State Space”, European Control Conf. 2019
H. Sharma and Y. Sun, “Approximate Large-Scale Kernel Reinforcement Learning”, Technical report
W. Haskell, P. Yu, H. Sharma and R. Jain, “Randomized Function Fitting-based Empirical Value Iteration”, IEEE Control and Decision Conf. (CDC) 2017
W. Haskell, R. Jain and H. Sharma, “A dynamical systems framework for stochastic iterative optimization”, IEEE Control and Decision Conf. (CDC) 2016
H. Sharma, A. Patel, S. N. Merchant and U. B. Desai, “Optimal Spectrum Sensing for Cognitive Radio with Imperfect Detector”, IEEE VTC, 2014
H. Sharma and R. Jain, “A Randomized RL Algorithm for MDPs with continuous states and actions”, IEEE Transactions on Automatic Control (TAC) (to be submitted)
H. Sharma, R. Yin, R. Jain, “Empirical Policy Learning in Continuous MDPs”, in progress
H. Sharma, Y. Abbasi, G. Theocharous and Z. Wen, “Exploration in Contextual MDPs: An UCRL approach”, in progress

Research Internships

Adobe Research, Summer 2018
Technicolor AI Labs, Summer 2017

Interesting Tidbit: Many people think that my name is Japanese, like a combination of Hitachi and Mitsubishi, but it is actually a Sanskrit word that means well-wisher.

Hiteshi Sharma
