tg-me.com/RIMLLab/153
Compositional Learning Journal Club
Join us this week for an in-depth discussion on Compositional Learning in the context of cutting-edge text-to-image generative models. We will explore recent breakthroughs and challenges, focusing on how these models handle compositional tasks and where improvements can be made.
This Week's Presentation:
Title: Can We Generate Images with CoT? Let's Verify and Reinforce Image Generation Step by Step
Presenter: Amir Kasaei
Abstract:
This paper explores the use of Chain-of-Thought (CoT) reasoning to improve autoregressive image generation, an area not widely studied. The authors propose three techniques: scaling computation for verification, aligning preferences with Direct Preference Optimization (DPO), and integrating these methods for enhanced performance. They introduce two new reward models, PARM and PARM++, which adaptively assess and correct image generations. Their approach improves the Show-o model, achieving a +24% gain on the GenEval benchmark and surpassing Stable Diffusion 3 by +15%.
Paper: Can We Generate Images with CoT? Let's Verify and Reinforce Image Generation Step by Step
Session Details:
- Date: Wednesday
- Time: 2:15 - 3:15 PM
- Location: Online at vc.sharif.edu/ch/rohban
We look forward to your participation!
BY RIML Lab
