
LMArena (https://lmarena.ai/?chat-modality=image) currently has a model codenamed `nano-banana` that generally outperforms gpt-image-1.

There's some speculation that it's Gemini 3's multi-modal output, and other speculation that it's an OpenAI model. It's hard to say definitively, since these models tend to hallucinate when interrogated.



Other than LMArena and a website I can't verify is authentic, I have no good way to run tests on this new model. Still, I have serious doubts that it'll pass my more difficult prompts, such as drawing a valid 2D maze with a clearly marked entrance and exit.

gpt-image-1 is in a class of its own with regard to prompt adherence in the "text to image" category.

Once it hits GA I'll put it through its paces and add it to the site!


I tested it by generating a man holding a Penrose triangle made of wood. gpt-image-1 succeeded; nano-banana failed. The aesthetics of nano-banana did look much better, though. I would guess that it is a diffusion model, based on the fact that it adds irrelevant but pretty background details, which gpt-image-1 tends to avoid.



