Hacker News

> They trained on publicly-available (no signup with TOS agreement) data, on the theory that training is fair use.

They openly state they used thousands of books from a pirate site as a training source. Go look up the datasets listed in the GPT-3 paper.


