💠 Compositional Learning Journal Club

Join us this week for an in-depth discussion on Compositional Learning in the context of cutting-edge text-to-image generative models. We will explore recent breakthroughs and challenges, focusing on how these models handle compositional tasks and where improvements can be made.

🌟 This Week's Presentation:

📌 Title:
A Cat Is A Cat (Not A Dog!): Unraveling Information Mix-ups in Text-to-Image Encoders through Causal Analysis and Embedding Optimization

🎙️ Presenter: Amir Kasaei

🧠 Abstract:
This work presents an in-depth analysis of the causal structure in the text encoder of text-to-image (T2I) diffusion models, highlighting its role in introducing information bias and loss. While prior research has mainly addressed these issues during the denoising stage, this study focuses on the underexplored contribution of text embeddings—particularly in multi-object generation scenarios. The authors investigate how text embeddings influence the final image output and why models often favor the first-mentioned object, leading to imbalanced representations. To mitigate this, they propose a training-free text embedding balance optimization method that improves information balance in Stable Diffusion by 125.42%. Additionally, a new automatic evaluation metric is introduced, offering a more accurate assessment of information loss with an 81% concordance rate with human evaluations. This metric better captures object presence and accuracy compared to existing measures like CLIP-based text-image similarity scores.

📄 Paper:
A Cat Is A Cat (Not A Dog!): Unraveling Information Mix-ups in Text-to-Image Encoders through Causal Analysis and Embedding Optimization

Session Details:
- 📅 Date: Tuesday
- 🕒 Time: 5:00 - 6:00 PM
- 🌐 Location: Online at vc.sharif.edu/ch/rohban

We look forward to your participation! ✌️



tg-me.com/RIMLLab/211
BY RIML Lab





