Hacker News

Based on the animation, I personally don't expect this to be very helpful. The main way diffusion models help is by preventing answers like "No. [proceeds to explain why the answer is yes]", and since the blocks are so small, the LLM can't fully work through its explanation before it has to commit to yes or no.


Could you expound on this? From what I'm reading, this sounds like exactly the kind of issue their block diffusion model is purposefully designed to mitigate: it conditions on previous blocks, and allows for larger blocks if that conditioning alone still doesn't maintain coherence.


It's an issue you run into whenever the model is forced to commit to a yes/no answer up front. It's a problem forward-only (autoregressive) LLMs have and full diffusion models don't, and normal block diffusion is closer to forward-only LLMs than to full diffusion models.

You could increase the block size to act more like a full diffusion model, but you would lose some of the benefits of block diffusion.
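To make that tradeoff concrete, here is a toy sketch (all names hypothetical, not the paper's actual code) of a block-autoregressive generation loop: each block of tokens is produced jointly, conditioned only on the blocks already finished. With block_size = 1 it degenerates into ordinary left-to-right decoding; with block_size equal to the full sequence length it behaves like a single full-sequence diffusion pass.

```python
# Toy sketch of block-autoregressive generation (hypothetical, not real model code).
# A real denoiser would run several diffusion steps per block; here it just
# emits placeholder tokens so the control flow is visible.

def denoise_block(context, block_size):
    """Stand-in for a diffusion denoiser: produces `block_size` tokens
    jointly, given all previously generated tokens as conditioning."""
    return [f"tok{len(context) + i}" for i in range(block_size)]

def generate(seq_len, block_size):
    out = []
    while len(out) < seq_len:
        size = min(block_size, seq_len - len(out))
        # Each block only sees earlier blocks, never later ones.
        out.extend(denoise_block(out, size))
    return out
```

With block_size = 1 the loop runs once per token (fully sequential, like a forward-only LLM); with block_size = seq_len the whole answer is produced in one joint pass, so the "yes/no" and its justification are decided together.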


Interesting. Makes me want to play around with an open diffusion LM. Do you have any recommendations?


My understanding here is that the block size can be arbitrarily large, under similar constraints as full diffusion models. Is that not the case?



