AI & Fundamentals
Reinforcement Learning With Constraints: From Theory to Reasoning in LLM - Lin Yang, Assistant Professor, UCLA
DATE: Tue, July 15, 2025 - 2:45 pm
LOCATION: UBC Vancouver Campus, Fred Kaiser (KAIS) building, Room 2020/2030, 2332 Main Mall
DETAILS
Abstract:
In this talk, I will explore reinforcement learning with constraints, focusing on both theoretical foundations and practical applications. I will first present recent advances in the sample complexity of constrained Markov decision processes (CMDPs), covering both offline and online settings. Our results establish near-optimal upper and lower bounds under relaxed and strict feasibility regimes, revealing that constraint satisfaction, while generally harder, can match the sample efficiency of unconstrained MDPs under certain conditions. These insights are grounded in primal-dual algorithms and generative model frameworks. Inspired by this theory, I will discuss how CMDPs can be applied to shape the behavior of large language models (LLMs), such as controlling reasoning length or enforcing budget constraints during fine-tuning. By treating response generation as a CMDP and incorporating online dual updates, we show that LLMs can be optimized to meet constraints with minimal degradation in performance.
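For readers unfamiliar with primal-dual methods, the idea of an online dual update can be illustrated with a toy sketch. This is not the speaker's algorithm: it assumes a fixed set of candidate responses with known rewards and costs (for example, response lengths), a softmax policy with exact gradients, and hypothetical names (`dual_update`, `primal_dual`) and step sizes chosen only for illustration.

```python
import math

def dual_update(lam, avg_cost, budget, step):
    """Projected dual ascent: increase lam when average cost exceeds the
    budget, decrease it otherwise, and never let it go below zero."""
    return max(0.0, lam + step * (avg_cost - budget))

def softmax(logits):
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def primal_dual(rewards, costs, budget, iters=2000, lr=0.5, dual_step=0.001):
    """Toy primal-dual loop over a fixed set of candidate responses.
    All inputs and hyperparameters are illustrative assumptions."""
    logits = [0.0] * len(rewards)
    lam = 0.0
    for _ in range(iters):
        probs = softmax(logits)
        # Lagrangian payoff: reward penalized by lam times cost.
        payoff = [r - lam * c for r, c in zip(rewards, costs)]
        baseline = sum(p * v for p, v in zip(probs, payoff))
        # Primal step: exact softmax policy gradient on the Lagrangian payoff.
        logits = [z + lr * p * (v - baseline)
                  for z, p, v in zip(logits, probs, payoff)]
        # Dual step: adjust the multiplier from the current expected cost.
        avg_cost = sum(p * c for p, c in zip(softmax(logits), costs))
        lam = dual_update(lam, avg_cost, budget, dual_step)
    return softmax(logits), lam
```

Calling `primal_dual([1.0, 0.6], [200.0, 50.0], budget=100.0)` runs the loop on two hypothetical responses, a longer high-reward one and a shorter lower-reward one. The multiplier rises while the expected cost is over budget, which shifts probability toward the cheaper response.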
This talk is part of a full-day event. Please see the event page for the full schedule.