BEGIN:VCALENDAR
VERSION:2.0
PRODID:-//https://caida.ubc.ca//NONSGML iCalcreator 2.41.92//
CALSCALE:GREGORIAN
METHOD:PUBLISH
UID:38356333-3532-4362-a232-396530346533
X-WR-RELCALID:efc09d74-9c93-479e-a94f-485231ddccde
X-WR-TIMEZONE:America/Vancouver
X-WR-CALNAME:Optimal Quantization for Matrix Multiplication - Or Ordentlich
 \, Associate Professor\, Hebrew University of Jerusalem
BEGIN:VTIMEZONE
TZID:America/Vancouver
TZUNTIL:20270314T100000Z
BEGIN:STANDARD
TZNAME:PST
DTSTART:20241103T020000
TZOFFSETFROM:-0700
TZOFFSETTO:-0800
RDATE:20251102T020000
RDATE:20261101T020000
END:STANDARD
BEGIN:DAYLIGHT
TZNAME:PDT
DTSTART:20250309T020000
TZOFFSETFROM:-0800
TZOFFSETTO:-0700
RDATE:20260308T020000
END:DAYLIGHT
END:VTIMEZONE
BEGIN:VEVENT
UID:5b963de5-085c-4801-9af0-3331f20fa6ca
DTSTAMP:20260430T161252Z
CLASS:PUBLIC
CREATED:20250827T211606Z
DESCRIPTION:Abstract: The main building block of large language models is m
 atrix multiplication\, which is often bottlenecked by the speed of loading
  these matrices from memory. A possible solution is to trade accuracy for 
 speed by storing the matrices in low precision (“quantizing” them). In rec
 ent years\, a number of quantization algorithms with increasingly better
  performance have been proposed (e.g.\, SmoothQuant\, Brain compression\,
  GPTQ\, QuIP\, QuIP#\, QuaRot\, SpinQuant). In this work\, we prove an
  information-theoretic lower bound on the achievable accuracy of
  computing a matrix product as a function of compression…
DTSTART;TZID=America/Vancouver:20250910T150000
DTEND;TZID=America/Vancouver:20250910T160000
LAST-MODIFIED:20250827T212108Z
LOCATION:UBC Vancouver Campus\, ICCS 288
SUMMARY:Optimal Quantization for Matrix Multiplication - Or Ordentlich\, As
 sociate Professor\, Hebrew University of Jerusalem
TRANSP:OPAQUE
URL:https://caida.ubc.ca/event/optimal-quantization-matrix-multiplication-o
 r-ordentlich-associate-professor-hebrew
END:VEVENT
END:VCALENDAR
