
They’re basically Markov chain text generators with a relevance-tracking-and-correction step. It turns out this is like 100x more useful than the same thing without the correction step, but they don’t really escape what they are “at heart”, if you will.

The ways they fail are often surprising if your baseline is “these are thinking machines”. If your baseline is what I wrote above (say, because you read the “Attention Is All You Need” paper), none of it is surprising.
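To make that framing concrete, here’s a toy Python sketch of the contrast (entirely made up for illustration: the corpus, the relevance() scorer, and the blending are stand-ins, not how a real transformer works). A plain Markov chain picks the next token from bigram counts conditioned only on the previous token, while the “attention-ish” version scores every token in the context for relevance and blends their next-token distributions before sampling.

    # Toy contrast between a plain Markov chain text generator and a
    # (very loosely) attention-flavored next-token picker.
    # Everything here -- corpus, scoring, blending -- is invented for
    # illustration; it is not a real transformer.
    import random
    from collections import defaultdict, Counter

    CORPUS = ("the cat sat on the mat the dog sat on the rug "
              "the cat chased the dog").split()

    # 1. Markov chain: next token depends only on the previous token.
    bigram_counts = defaultdict(Counter)
    for prev, nxt in zip(CORPUS, CORPUS[1:]):
        bigram_counts[prev][nxt] += 1

    def markov_next(prev_token: str) -> str:
        counts = bigram_counts[prev_token]
        tokens, weights = zip(*counts.items())
        return random.choices(tokens, weights=weights)[0]

    # 2. "Attention-ish" picker: weight every context token by a relevance
    #    score against the latest token, then blend their bigram
    #    distributions accordingly before sampling.
    def relevance(query: str, key: str) -> float:
        # Stand-in for a learned dot-product score: crude character overlap.
        return 1.0 + sum(c1 == c2 for c1, c2 in zip(query, key))

    def attention_next(context: list[str]) -> str:
        query = context[-1]
        scores = [relevance(query, tok) for tok in context]
        total = sum(scores)
        blended = Counter()
        for tok, score in zip(context, scores):
            for cand, cnt in bigram_counts[tok].items():
                blended[cand] += (score / total) * cnt
        tokens, weights = zip(*blended.items())
        return random.choices(tokens, weights=weights)[0]

    if __name__ == "__main__":
        random.seed(0)
        ctx = ["the", "cat"]
        for _ in range(8):
            ctx.append(markov_next(ctx[-1]))
        print("markov   :", " ".join(ctx))

        ctx = ["the", "cat"]
        for _ in range(8):
            ctx.append(attention_next(ctx))
        print("attention:", " ".join(ctx))

The point is just the shape of the thing: both versions only ever predict the next token from statistics over text; the attention-flavored one simply gets to consult (and weight) the whole context while doing it.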



See also: 3Blue1Brown's fantastic series on deep learning, particularly videos like "Transformers, the tech behind LLMs".

My own mental model (condensed to a single phrase) is that LLMs are extremely convincing (on the surface) autocomplete. So far, this model has not disappointed me.




