💠 Compositional Learning Journal Club

Join us this week for an in-depth discussion on Compositional Learning in the context of cutting-edge text-to-image generative models. We will explore recent breakthroughs and challenges, focusing on how these models handle compositional tasks and where improvements can be made.

✅ This Week's Presentation:

🔹 Title: Divide, Evaluate, and Refine: Evaluating and Improving Text-to-Image Alignment with Iterative VQA Feedback

🔸 Presenter: Amir Kasaei

🌀 Abstract:
Text-conditioned image generation has advanced significantly, particularly through latent diffusion models. However, as prompts grow more complex, these models often fail to capture their semantics accurately, and existing tools such as CLIP frequently fail to detect the resulting misalignments.

This presentation introduces the Decompositional-Alignment-Score, which breaks a complex prompt into individual assertions and uses a visual question answering (VQA) model to evaluate how well each assertion aligns with the generated image. The per-assertion scores are then combined into a final alignment score. Experimental results show that this score agrees with human judgments better than traditional CLIP and BLIP scores. Moreover, it enables an iterative refinement process that improves text-to-image alignment by 8.7% over previous methods.

This approach not only enhances evaluation but also provides actionable feedback for generating more accurate images from complex textual inputs.
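
For readers new to the idea, the decompose-then-score step described in the abstract can be sketched roughly as follows. This is only an illustration under stated assumptions, not the paper's implementation: `AssertionScorer`, `decompositional_score`, and `weakest_assertion` are hypothetical names, the VQA model is replaced by any callable that returns a probability between 0 and 1, and the plain mean used to combine scores is an assumption (the paper may aggregate differently).

```python
from typing import Callable, Dict, Sequence, Tuple

# Hypothetical scorer type: given an image and one assertion, return the
# probability (0..1) that the assertion holds. In practice this would wrap
# a VQA model; here it is just a callable supplied by the caller.
AssertionScorer = Callable[[object, str], float]


def decompositional_score(
    image: object,
    assertions: Sequence[str],
    scorer: AssertionScorer,
) -> Tuple[float, Dict[str, float]]:
    """Score each assertion separately, then combine the scores.

    The combination is a plain mean, chosen for illustration only;
    it is an assumption, not the formula from the paper.
    """
    if not assertions:
        raise ValueError("need at least one assertion")
    per_assertion = {a: scorer(image, a) for a in assertions}
    overall = sum(per_assertion.values()) / len(per_assertion)
    return overall, per_assertion


def weakest_assertion(per_assertion: Dict[str, float]) -> str:
    """Return the lowest-scoring assertion, i.e. the part of the prompt
    that a refinement step might target first."""
    return min(per_assertion, key=per_assertion.get)


def toy_scorer(image: object, assertion: str) -> float:
    """Stand-in for a VQA model: the 'image' is a dict of fixed scores."""
    return image[assertion]


if __name__ == "__main__":
    fake_image = {"there is a red cube": 0.9, "there is a blue ball": 0.3}
    overall, scores = decompositional_score(
        fake_image, list(fake_image), toy_scorer
    )
    print(overall, weakest_assertion(scores))
```

The per-assertion breakdown is what makes the feedback actionable: a single global similarity score cannot say which part of the prompt was missed, while the lowest-scoring assertion points directly at it.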

📄 Paper: Divide, Evaluate, and Refine: Evaluating and Improving Text-to-Image Alignment with Iterative VQA Feedback


Session Details:
- 📅 Date: Sunday
- 🕒 Time: 2:00 - 3:00 PM
- 🌐 Location: Online at vc.sharif.edu/ch/rohban


We look forward to your participation! ✌️



tg-me.com/RIMLLab/133


BY RIML Lab



