
The two images shown in the article using the new method are sort of… stylized or slightly cartoonish in a way that the images generated without their method are not. Their images also have a “perfectly framed, looking straight at the camera” quality, which looks a little artificial. The images not using their method have a more natural look (although, obviously, they have the issue with the duplicated subject).

I wonder if it is an unavoidable result of their method, or if it is just a minor issue (of course, it is hard to get infinite compute as an academic; maybe they just need to train more. Is that a thing? I don’t AI).



Cartoonish output is a problem across the board. If you explicitly ask Dall-E for a "photograph" of something, you will very often get a result that looks like a cartoonified illustration. Prompt writers resort to specifying exact camera models and lenses to try to constrain the process.


There are fine-tuned models out there that can generate near-photorealistic results. The base SD models and those offered by the major AI service sites have a more stylized look to them, probably partly to work on a wider array of prompts that may include non-photorealistic subjects, and partly for safety.



