Sangdoo Yun
I am a Research Director at NAVER AI Lab,
working on multimodal foundation models that are capable, efficient, reliable, and practical for real-world AI applications.
Recently, our group has been exploring how multimodal models can better perceive, reason, retrieve, and interact.
This includes work on world modeling (e.g., Seoul World Model),
vision-language alignment (e.g., ProLIP, HYPE, MuCo),
and efficient systems (e.g., Model Stock, PIVOT, KVzip, Fast KVzip).
I've also worked on network architectures (ReXNet, PiT),
training techniques (CutMix, ReLabel, AdamP, KD),
and robustness (ReBias, Shortcut learning, Model Stock).
On the product side, I've contributed to NAVER's OCR
(e.g., CRAFT, STR, Donut),
face recognition, and LLM (e.g., Cream, HCX) products.
Our group's goal is to build efficient and reliable multimodal intelligence for the agentic AI era:
foundation models that can perceive, reason, retrieve, and act in real-world environments,
while remaining practical for large-scale AI services.
I received my MS and PhD in computer vision from Seoul National University in 2013 and 2017, respectively, under the supervision of Prof. Jin Young Choi.
I received my BS from Seoul National University in 2010.
I have also been an adjunct professor at SNU AI Inst. since Sep 2022, continuing from my previous position at SNU CSE Dept. (Sep 2021 - Aug 2022).
My previous and ongoing lectures at SNU are available at [Spring 2022], [Fall 2024], and [Fall 2025].
Email  / 
Google Scholar (20K+ citations)  / 
CV  / 
GitHub