BEGIN:VCALENDAR
VERSION:2.0
PRODID:-//https://caida.ubc.ca//NONSGML iCalcreator 2.41.92//
CALSCALE:GREGORIAN
METHOD:PUBLISH
UID:32386562-3066-4337-b262-633265366330
X-WR-RELCALID:efc09d74-9c93-479e-a94f-485231ddccde
X-WR-TIMEZONE:America/Vancouver
X-WR-CALNAME:Rethinking the Objective for Policy Optimization in Reinforcement Learning - Martha White\, Associate Professor\, University of Alberta
BEGIN:VTIMEZONE
TZID:America/Vancouver
TZUNTIL:20220313T100000Z
BEGIN:STANDARD
TZNAME:PST
DTSTART:20191103T020000
TZOFFSETFROM:-0700
TZOFFSETTO:-0800
RDATE:20201101T020000
RDATE:20211107T020000
END:STANDARD
BEGIN:DAYLIGHT
TZNAME:PDT
DTSTART:20200308T020000
TZOFFSETFROM:-0800
TZOFFSETTO:-0700
RDATE:20210314T020000
END:DAYLIGHT
END:VTIMEZONE
BEGIN:VEVENT
UID:ddcb56ed-7388-474a-9439-babee0c03068
DTSTAMP:20251220T054739Z
CLASS:PUBLIC
CREATED:20200529T190730Z
DESCRIPTION:Please register for this event here Abstract: The goal in reinforcement learning is to obtain a policy that maximizes long-term reward. Policy optimization in reinforcement learning involves directly estimating a parameterized policy\, that maps states to probabilities over actions. Typically\, these algorithms are built on the policy gradient theorem\, which provides a simple form for the gradient of the policy optimization objective. In practice\, however\, a key weighting in the gradient is dropped for convenience\; despite this omission\, these widely used algorithms seem to perform quite well…
DTSTART;TZID=America/Vancouver:20200615T153000
DTEND;TZID=America/Vancouver:20200615T163000
LAST-MODIFIED:20210610T230539Z
LOCATION:Please register to receive the Zoom link
SUMMARY:Rethinking the Objective for Policy Optimization in Reinforcement Learning - Martha White\, Associate Professor\, University of Alberta
TRANSP:OPAQUE
URL:https://caida.ubc.ca/event/rethinking-objective-policy-optimization-reinforcement-learning-martha-white-associate
END:VEVENT
END:VCALENDAR