
The most interesting thing to me is that the spelling is correct.

I'm not a heavy user of AI or of image generation in general, so is this also part of the new release, or has this been fixed silently since I last tried?



It very much looks like a consequence of this new architecture. In my experience, text looks much better in recent DALL-E images (i.e., what ChatGPT was using before), but it is still noticeably mangled when printing more than a few letters. This model update seems to improve text rendering a lot, at least as long as the content is clearly specified.

However, when given a prompt that requires the model to come up with the text itself, it still seems to struggle a bit, as can be seen in this hilarious example from the post: https://images.ctfassets.net/kftzwdyauwt9/21nVyfD2KFeriJXUNL...


The periodic table is absolutely hilarious; I didn't know LLMs had finally mastered absurdist humor.


Yeah, who wouldn't love a dip in the sulphur pool? But back to the question: why can't such a model recognize letters as such? Can't it be trained to pay special attention to characters? How come it can render an anatomically correct eye but not differentiate between a P and a Z?


I think the model hasn't decided whether it should print a P or a Z, so you end up with something halfway between the two.

It's a side effect of the entire model being differentiable: in a continuous output space there is always some halfway point.
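
To make that concrete, here is a toy sketch (hypothetical, not the actual model; the one-hot NumPy vectors just stand in for whatever learned glyph representation the model really uses). In a continuous, differentiable space, every interpolation between two letters is itself a valid point, so nothing forces the output to commit to one glyph:

    import numpy as np

    # Hypothetical one-hot stand-ins for the model's internal
    # representations of the glyphs 'P' and 'Z'.
    p = np.array([1.0, 0.0])
    z = np.array([0.0, 1.0])

    # Any convex combination is a perfectly valid point in this
    # continuous space, so optimization can settle on it.
    halfway = 0.5 * p + 0.5 * z
    print(halfway)  # [0.5 0.5]: neither 'P' nor 'Z', but between them

Discrete text has no such in-between point: a character is either a P or a Z. That's why rendered letters can come out looking like a blend of both.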



