BEGIN:VCALENDAR
VERSION:2.0
PRODID:-//https://caida.ubc.ca//NONSGML iCalcreator 2.41.92//
CALSCALE:GREGORIAN
METHOD:PUBLISH
UID:64313665-3232-4363-b530-346131366664
X-WR-RELCALID:efc09d74-9c93-479e-a94f-485231ddccde
X-WR-TIMEZONE:America/Vancouver
X-WR-CALNAME:Advancing Multimodal Vision-Language Learning - Aishwarya Agrawal\, Assistant Professor\, University of Montreal
BEGIN:VTIMEZONE
TZID:America/Vancouver
TZUNTIL:20261101T090000Z
BEGIN:STANDARD
TZNAME:PST
DTSTART:20241103T020000
TZOFFSETFROM:-0700
TZOFFSETTO:-0800
RDATE:20251102T020000
END:STANDARD
BEGIN:DAYLIGHT
TZNAME:PDT
DTSTART:20240310T020000
TZOFFSETFROM:-0800
TZOFFSETTO:-0700
RDATE:20250309T020000
RDATE:20260308T020000
END:DAYLIGHT
END:VTIMEZONE
BEGIN:VEVENT
UID:db10b64a-7c9f-419c-b809-bed1f2743528
DTSTAMP:20260306T065734Z
CLASS:PUBLIC
CREATED:20241212T191414Z
DESCRIPTION:Abstract: Over the last decade\, multimodal vision-language (VL) research has seen impressive progress. We can now automatically caption images in natural language\, answer natural language questions about images\, retrieve images using complex natural language queries and even generate images given natural language descriptions. Despite such tremendous progress\, current VL research faces several challenges that limit the applicability of state-of-art VL systems. Even large VL systems based on multimodal large language models (MLLMs) such as GPT-4V and Gemini struggle with counting objects in…
DTSTART;TZID=America/Vancouver:20241216T134500
DTEND;TZID=America/Vancouver:20241216T144500
LAST-MODIFIED:20241212T191956Z
LOCATION:UBC Vancouver Campus\, ICCS X836
SUMMARY:Advancing Multimodal Vision-Language Learning - Aishwarya Agrawal\, Assistant Professor\, University of Montreal
TRANSP:OPAQUE
URL:https://caida.ubc.ca/index.php/event/advancing-multimodal-vision-language-learning-aishwarya-agrawal-assistant-professor
END:VEVENT
END:VCALENDAR