I did read it, and even with their idea of focusing on a world model, a multimodal AGI that can also operate on audio, images, and video will be more useful than one that operates purely on text.


I'm skeptical you read it, because he doesn't make that argument. In fact, I've literally never heard anyone argue that text-only is more useful than multimodal.



