BEGIN:VCALENDAR
VERSION:2.0
PRODID:-//https://caida.ubc.ca//NONSGML iCalcreator 2.41.92//
CALSCALE:GREGORIAN
METHOD:PUBLISH
UID:65623632-3063-4062-b030-396366383538
X-WR-RELCALID:efc09d74-9c93-479e-a94f-485231ddccde
X-WR-TIMEZONE:America/Vancouver
X-WR-CALNAME:Self-Play Fine-Tuning Converts Weak Language Models to Strong Language Models - Quanquan Gu\, Associate Professor\, UCLA
BEGIN:VTIMEZONE
TZID:America/Vancouver
TZUNTIL:20251102T090000Z
BEGIN:STANDARD
TZNAME:PST
DTSTART:20231105T020000
TZOFFSETFROM:-0700
TZOFFSETTO:-0800
RDATE:20241103T020000
END:STANDARD
BEGIN:DAYLIGHT
TZNAME:PDT
DTSTART:20230312T020000
TZOFFSETFROM:-0800
TZOFFSETTO:-0700
RDATE:20240310T020000
RDATE:20250309T020000
END:DAYLIGHT
END:VTIMEZONE
BEGIN:VEVENT
UID:2bc3224f-ea08-445e-a657-5bf620416f49
DTSTAMP:20260327T115315Z
CLASS:PUBLIC
CREATED:20240209T230336Z
DESCRIPTION:Zoom Link Abstract: Harnessing the power of human-annotated data through Supervised Fine-Tuning (SFT) is pivotal for advancing Large Language Models (LLMs). In this paper\, we delve into the prospect of growing a strong LLM out of a weak one without the need for acquiring additional human-annotated data. We propose a new fine-tuning method called Self-Play fIne-tuNing (SPIN)\, which starts from a supervised fine-tuned model. At the heart of SPIN lies a self-play mechanism\, where the LLM refines its capability by playing against instances of itself. More specifically\, the LLM generates its own…
DTSTART;TZID=America/Vancouver:20240227T130000
DTEND;TZID=America/Vancouver:20240227T140000
LAST-MODIFIED:20240221T233645Z
LOCATION:UBC Vancouver Campus\, ICCS X836
SUMMARY:Self-Play Fine-Tuning Converts Weak Language Models to Strong Language Models - Quanquan Gu\, Associate Professor\, UCLA
TRANSP:OPAQUE
URL:https://caida.ubc.ca/event/self-play-fine-tuning-converts-weak-language-models-strong-language-models-quanquan-gu
END:VEVENT
END:VCALENDAR