
Why aren’t they using this technique to design better transformer architectures or completely novel machine learning architectures in general? Are plain or mostly plain transformers really peak? I find that hard to believe.


Because chip placement and the design of neural network architectures are entirely different problems, so this solution won't magically transfer from one to the other.


And AlphaGo was trained to play Go. The point is training a model through self-play to build neural network architectures. If it can learn to play Go and to place chips, I don't see why it couldn't be trained to design novel ML architectures.
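For what it's worth, the closest existing work is probably policy-gradient neural architecture search (Zoph & Le) rather than AlphaGo-style self-play. Here's a minimal sketch of that idea: a tiny REINFORCE controller learns to pick per-layer operations. The search space, the `proxy_reward` function, and all names are hypothetical stand-ins; in real NAS the reward would be the validation accuracy of a fully trained child network, which is exactly what makes it so expensive.

```python
# Minimal sketch of RL-based neural architecture search (REINFORCE).
# Hypothetical search space and toy reward; purely illustrative.
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical search space: choose one op for each of 3 layers.
CHOICES = [("attn", "conv", "mlp")] * 3

# Controller: one learned logit vector per decision (a tiny policy).
logits = [np.zeros(len(c)) for c in CHOICES]

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def sample_architecture():
    """Sample one architecture (a list of op indices) from the policy."""
    return [rng.choice(len(l), p=softmax(l)) for l in logits]

def proxy_reward(arch):
    """Stand-in for 'train the child network, return validation accuracy'.
    Here we just pretend attention in every slot is best."""
    return sum(1.0 for a in arch if a == 0) / len(arch)

baseline, lr = 0.0, 0.5
for step in range(200):
    arch = sample_architecture()
    r = proxy_reward(arch)
    baseline = 0.9 * baseline + 0.1 * r    # moving-average baseline
    for l, a in zip(logits, arch):
        grad = -softmax(l)
        grad[a] += 1.0                     # d log pi(a) / d logits
        l += lr * (r - baseline) * grad    # REINFORCE update

best = [CHOICES[i][int(np.argmax(l))] for i, l in enumerate(logits)]
print("controller's preferred architecture:", best)
```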


Sure, they could choose to work on that problem. But why do you think it's more important or worthwhile than chip design or any other problem they might choose to work on? My point was that it's not trivial to make self-play work for some other problem, so given all the problems in the world, why did you single out neural network architecture design? Especially since it's not the transformer architecture that is really holding back AI progress.


Recursive self-improvement.



