Signal Processing Society
Q-learning, which seeks to learn the optimal Q-function of a Markov decision process (MDP) in a model-free fashion, lies at the heart of reinforcement learning practices. However, theoretical understandings on its non-asymptotic sample complexity remain unsatisfactory, despite significant recent efforts. In this talk, we first show a tight sample complexity bound of Q-learning in the single-agent setting, together with a matching lower bound to establish its minimax sub-optimality. We then show how federated versions of Q-learning allow collaborative learning using data collected by multiple agents without central sharing, where an importance averaging scheme is introduced to unveil the blessing of heterogeneity.
Speaker: Dr. Yuejie Chi
Dr. Yuejie Chi is the Sense of Wonder Group Endowed Professor of Electrical and Computer Engineering in AI Systems at Carnegie Mellon University, with courtesy appointments in the Machine Learning department and CyLab. She received her Ph.D. and M.A. from Princeton University, and B. Eng. (Hon.) from Tsinghua University, all in Electrical Engineering. Her research interests lie in the theoretical and algorithmic foundations of data science, signal processing, machine learning and inverse problems, with applications in sensing, imaging, decision making, and societal systems, broadly defined. Among others, Dr. Chi received the Presidential Early Career Award for Scientists and Engineers (PECASE) and the inaugural IEEE Signal Processing Society Early Career Technical Achievement Award for contributions to high-dimensional structured signal processing. She is an IEEE Fellow (Class of 2023) for contributions to statistical signal processing with low-dimensional structures.