Hacker News

Their loss curve with RL didn't level off much, though. It could be taken a lot further and scaled up to more parameters on the big Nvidia mega-clusters out there. And the architecture is heavily tuned to Nvidia optimizations.

