AI & Fundamentals
Memory-augmented Optimizers for Deep Learning and Lifelong Learning - Sarath Chandar, Assistant Professor, Polytechnique Montreal
DATE: Mon, February 26, 2024 - 10:00 am
LOCATION: UBC Vancouver Campus, ICCS X836
In this talk, I will introduce the idea of adding external memory to standard optimization methods to improve their performance in deep learning and lifelong learning. In the first part of the talk, I will focus on general deep learning problems and introduce a new family of optimizers based on critical gradients. These optimizers retain a limited view of their gradient history in internal memory and scale well to large real-world datasets. Our experiments show that the proposed memory-augmented extensions of standard optimizers enjoy accelerated convergence and improved performance on a majority of the computer vision and language tasks we considered. I will then introduce self-exploratory optimizers, which explore the loss landscape by storing critical momenta in internal memory; our experiments show that such optimizers find flatter solutions.
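The core idea of a critical-gradient optimizer can be sketched in a few lines. The buffer size, the norm-based retention rule with decaying priorities, and the 50/50 mixing weight below are illustrative assumptions for this sketch, not the exact formulation from the talk:

```python
import numpy as np

class CriticalGradientSGD:
    """Minimal sketch of a memory-augmented SGD variant: keep a small
    buffer of the largest-norm ("critical") gradients seen so far and
    mix their average into each update. Stored priorities decay over
    time so stale gradients are eventually evicted. All hyperparameters
    here are assumptions for illustration."""

    def __init__(self, lr=0.1, capacity=5, mix=0.5, decay=0.7):
        self.lr = lr
        self.capacity = capacity  # number of gradients kept in memory
        self.mix = mix            # weight on the memory average
        self.decay = decay        # per-step decay of stored priorities
        self.memory = []          # list of (priority, gradient) pairs

    def step(self, params, grad):
        # Age the stored entries so old gradients lose priority.
        self.memory = [(self.decay * p, g) for p, g in self.memory]
        norm = np.linalg.norm(grad)
        if len(self.memory) < self.capacity:
            self.memory.append((norm, grad.copy()))
        else:
            # Replace the lowest-priority entry if the new gradient beats it.
            i_min = min(range(len(self.memory)),
                        key=lambda i: self.memory[i][0])
            if norm > self.memory[i_min][0]:
                self.memory[i_min] = (norm, grad.copy())
        # Blend the current gradient with the average of stored gradients.
        mem_avg = np.mean([g for _, g in self.memory], axis=0)
        update = (1 - self.mix) * grad + self.mix * mem_avg
        return params - self.lr * update

# Toy usage: minimize f(x) = ||x||^2, whose gradient is 2x.
opt = CriticalGradientSGD()
x = np.array([3.0, -2.0])
for _ in range(200):
    x = opt.step(x, 2 * x)
print(np.linalg.norm(x))  # prints a value near 0
```

The decay on stored priorities is the detail that makes the buffer self-refreshing: without it, the large early gradients would dominate the memory forever and bias the iterates away from the minimum.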
In the second part of the talk, I will discuss the challenges of optimization for lifelong learning. While existing lifelong learning methods employ the general, task-agnostic stochastic gradient descent update rule, we propose a task-aware optimizer that adapts the learning rate based on the relatedness among tasks. We show empirically that our adaptive learning rate not only mitigates catastrophic forgetting but also enables positive backward transfer. We also show that our method outperforms several state-of-the-art lifelong learning methods on complex datasets with large numbers of tasks.
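One way to picture a task-aware learning rate is to modulate the step size by how related the current task is to earlier ones. In this sketch, relatedness is measured by cosine similarity between the current gradient and the average gradient of previous tasks; this proxy and the linear mapping to a scaling factor are assumptions for illustration, not the talk's exact criterion:

```python
import numpy as np

def task_aware_lr(base_lr, grad, prev_task_grads, lo=0.1):
    """Sketch of a task-aware learning rate. Related tasks (similarity
    near +1) keep the full step, encouraging positive transfer;
    conflicting tasks (similarity near -1) shrink the step toward
    lo * base_lr to limit interference with earlier tasks. The cosine
    proxy and [lo, 1] mapping are illustrative assumptions."""
    if not prev_task_grads:
        return base_lr  # first task: no relatedness signal yet
    mean_prev = np.mean(prev_task_grads, axis=0)
    cos = np.dot(grad, mean_prev) / (
        np.linalg.norm(grad) * np.linalg.norm(mean_prev) + 1e-12)
    # Map similarity in [-1, 1] to a scaling factor in [lo, 1].
    factor = lo + (1 - lo) * (cos + 1) / 2
    return base_lr * factor

# Aligned gradients keep the learning rate high; opposed ones shrink it.
g_new = np.array([1.0, 0.0])
print(task_aware_lr(0.1, g_new, [np.array([1.0, 0.0])]))   # ~0.1
print(task_aware_lr(0.1, g_new, [np.array([-1.0, 0.0])]))  # ~0.01
```

The intuition matches the abstract: a task-agnostic SGD rule applies the same step everywhere, whereas a relatedness-dependent step can protect earlier tasks when gradients conflict and exploit shared structure when they align.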
Sarath Chandar is an Assistant Professor at Polytechnique Montreal, where he leads the Chandar Research Lab. He is also a core faculty member at Mila, the Quebec AI Institute. Sarath holds a Canada CIFAR AI Chair and the Canada Research Chair in Lifelong Machine Learning. His research interests include lifelong learning, deep learning, optimization, reinforcement learning, and natural language processing. To promote research in lifelong learning, Sarath created the Conference on Lifelong Learning Agents (CoLLAs) in 2022 and served as program chair in 2022 and 2023. He received his PhD from the University of Montreal and his MS by research from the Indian Institute of Technology Madras.