RIML Lab

💠 Compositional Learning Journal Club

Join us this week for an in-depth discussion on Compositional Learning in the context of cutting-edge text-to-image generative models. We will explore recent breakthroughs and challenges, focusing on how these models handle compositional tasks and where improvements can be made.

✅ This Week's Presentation:

🔹 Title: Can We Generate Images with CoT? Let's Verify and Reinforce Image Generation Step by Step

🔸 Presenter: Amir Kasaei

🌀 Abstract:
This paper explores using Chain-of-Thought (CoT) reasoning to improve autoregressive image generation, an area that has not been widely studied. The authors examine three techniques: scaling test-time computation for verification, aligning model preferences with Direct Preference Optimization (DPO), and integrating the two for enhanced performance. They also introduce two new reward models, PARM and PARM++, which adaptively assess and correct image generations. Applied to the Show-o model, their approach achieves a +24% gain on the GenEval benchmark and surpasses Stable Diffusion 3 by +15%.

📄 Paper: Can We Generate Images with CoT? Let's Verify and Reinforce Image Generation Step by Step

Session Details:
- 📅 Date: Sunday
- 🕒 Time: 5:30 - 6:30 PM
- 🌐 Location: Online at vc.sharif.edu/ch/rohban

We look forward to your participation! ✌️

arXiv.org

Can We Generate Images with CoT? Let's Verify and Reinforce...

Chain-of-Thought (CoT) reasoning has been extensively explored in large models to tackle complex understanding tasks. However, it still remains an open question whether such strategies can be...

www.tg-me.com/br/telegram/com.RIMLLab/151

4.2K views · Amir Kasaei, edited Jan 26 at 06:47