Hacker News

Based on the animation, I personally don't expect this to be very helpful. The main way diffusion models help is by preventing answers like "No. [proceeds to explain why the answer is yes]", and since the blocks are so small, the LLM can't fully work through its explanation before it has to commit to yes or no.


Could you expound on this? From what I'm reading, this sounds like exactly the kind of issue their block diffusion model is purposefully designed to mitigate: it conditions on previous blocks, and allows for larger blocks if that conditioning alone still doesn't maintain coherence.


It's an issue you run into whenever the model is forced to commit to a yes/no answer up front. It's a problem forward-only (autoregressive) LLMs have and full diffusion models don't, and normal block diffusion is closer to forward-only LLMs than to full diffusion models.

You could increase the block size to act more like a full diffusion model, but you would lose some of the benefits of block diffusion.
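To make that tradeoff concrete, here is a toy sketch (all names hypothetical, not the paper's actual code) of a block-autoregressive generation loop: each block of tokens is produced jointly, conditioned only on the blocks already finished. With block_size = 1 it degenerates into ordinary left-to-right decoding; with block_size equal to the full sequence length it behaves like a single full-sequence diffusion pass.

```python
# Toy sketch of block-autoregressive generation (hypothetical, not real model code).
# A real denoiser would run several diffusion steps per block; here it just
# emits placeholder tokens so the control flow is visible.

def denoise_block(context, block_size):
    """Stand-in for a diffusion denoiser: produces `block_size` tokens
    jointly, given all previously generated tokens as conditioning."""
    return [f"tok{len(context) + i}" for i in range(block_size)]

def generate(seq_len, block_size):
    out = []
    while len(out) < seq_len:
        size = min(block_size, seq_len - len(out))
        # Each block only sees earlier blocks, never later ones.
        out.extend(denoise_block(out, size))
    return out
```

With block_size = 1 the loop runs once per token (fully sequential, like a forward-only LLM); with block_size = seq_len the whole answer is produced in one joint pass, so the "yes/no" and its justification are decided together.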


Interesting. Makes me want to play around with an open diffusion LM. Do you have any recommendations?


My understanding here is that the block size can be arbitrarily large, under similar constraints as full diffusion models. Is that not the case?



