AI & Fundamentals
CAIDA AI-Technical Seminar - The Merits of Models in Continuous Reinforcement Learning, Benjamin Recht (University of California, Berkeley)
DATE: Thu, May 9, 2019 - 2:00 pm
LOCATION: X836 - 2366 Main Mall, V6T 1Z4 (Computer Science Building)
Classical control theory and machine learning have similar goals: acquire data about the environment, make a prediction, and use that prediction to positively impact the world. However, the approaches they use are frequently at odds. Controls is the theory of designing complex actions from well-specified models, while machine learning makes intricate, model-free predictions from data alone. For contemporary autonomous systems, some sort of hybrid may be essential in order to fuse and process the vast amounts of sensor data recorded into timely, agile, and safe decisions. In this talk, I will examine the relative merits of model-based and model-free methods in data-driven control problems. I will discuss quantitative estimates of the number of measurements required to achieve high-quality control performance and statistical techniques that can distinguish the relative power of different methods. In particular, I will show how model-free methods are considerably less sample-efficient than their model-based counterparts. I will also describe how notions of robustness, safety, constraint satisfaction, and exploration can be transparently incorporated into model-based methods. I will conclude with a discussion of possible positive roles for model-free methods in contemporary autonomous systems that may mitigate their high sample complexity and lack of reliability and versatility.
Benjamin Recht is an Associate Professor in the Department of Electrical Engineering and Computer Sciences at the University of California, Berkeley. Ben's research group studies the theory and practice of optimization algorithms with a particular focus on applications in machine learning and control. Ben is the recipient of a Presidential Early Career Award for Scientists and Engineers, an Alfred P. Sloan Research Fellowship, the 2012 SIAM/MOS Lagrange Prize in Continuous Optimization, the 2014 Jamon Prize, the 2015 William O. Baker Award for Initiatives in Research, and the 2017 NIPS Test of Time Award.