
It's still based on human games. It plays itself, but the way it plays was inherited from humans. I wonder whether there is some fundamental barrier to what you can reach with reinforcement learning, depending on the base you start from.


Having it learn from human games was just a way to speed up initialization before running reinforcement learning; it didn't limit the state tree that was searched later on.


It is based on human games until it explores well enough to break away from local optima.
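
To make that concrete, here is a toy sketch, not the training setup of the system under discussion. It assumes a two-armed bandit where arm 0 stands in for the "human-style" move and arm 1 for a better move the human data never favored. The payoffs, the initial value estimates, and the `train` function are all made up for illustration. With no exploration the learner stays on the inherited move forever; with a little exploration it breaks away.

```python
import random

# Hypothetical fixed payoffs: arm 0 = "human-style" move, arm 1 = better move.
REWARDS = [0.5, 0.9]

def train(epsilon, steps=1000, lr=0.5, seed=0):
    """Epsilon-greedy value learning; returns the arm preferred at the end."""
    rng = random.Random(seed)
    # Value estimates start out biased toward the human move (assumed prior).
    q = [0.5, 0.0]
    for _ in range(steps):
        if rng.random() < epsilon:
            arm = rng.randrange(2)                    # explore: pick any arm
        else:
            arm = max(range(2), key=lambda a: q[a])   # exploit current estimate
        # Move the estimate for the chosen arm toward the observed reward.
        q[arm] += lr * (REWARDS[arm] - q[arm])
    return max(range(2), key=lambda a: q[a])

print(train(epsilon=0.0))  # purely greedy: never leaves the inherited arm
print(train(epsilon=0.1))  # some exploration: finds the better arm
```

The human prior only decides where the search starts. Whether the learner gets past it depends on how much it explores, which is the point made above.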



