💠 Compositional Learning Journal ClubJoin us this week for an in-depth discussion on Compositional Learning in the context of cutting-edge text-to-image generative models. We will explore recent breakthroughs and challenges

RIML Lab

💠 Compositional Learning Journal Club

Join us this week for an in-depth discussion on Compositional Learning in the context of cutting-edge text-to-image generative models. We will explore recent breakthroughs and challenges, focusing on how these models handle compositional tasks and where improvements can be made.

✅ This Week's Presentation:

🔹 Title: Object-Attribute Binding in Text-to-Image Generation: Evaluation and Control

🔸 Presenter: Arshia Hemmat

🌀 Abstract:
This presentation introduces advancements in addressing compositional challenges in text-to-image (T2I) generation models. Current diffusion models often struggle to associate attributes accurately with the intended objects based on text prompts. To address this, a new Edge Prediction Vision Transformer (EPViT) is introduced for improved image-text alignment evaluation. Additionally, the proposed Focused Cross-Attention (FCA) mechanism uses syntactic constraints from input sentences to enhance visual attention maps. DisCLIP embeddings further disentangle multimodal embeddings, improving attribute-object alignment. These innovations integrate seamlessly into state-of-the-art diffusion models, enhancing T2I generation quality without additional model training.

📄 Paper: Object-Attribute Binding in Text-to-Image Generation: Evaluation and Control

Session Details:
- 📅 Date: Sunday
- 🕒 Time: 5:00 - 6:00 PM
- 🌐 Location: Online at vc.sharif.edu/ch/rohban

We look forward to your participation! ✌️

arXiv.org

Object-Attribute Binding in Text-to-Image Generation: Evaluation...

Current diffusion models create photorealistic images given a text prompt as input but struggle to correctly bind attributes mentioned in the text to the right objects in the image. This is...

www.tg-me.com/tw/telegram/com.RIMLLab/144

2.6K viewsAmir Kasaei, Nov 9, 2024 at 16:32