
If the markup is badly written spaghetti, it would have to implicitly render the HTML+CSS to know which two elements end up visually next to each other.


The linked post demonstrates arbitrary re-ordering of image patches. Spatial continuity is not relevant to neural networks.


That's ridiculous, sorry. If that were so, we wouldn't have positional encodings in vision transformers.
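
To make the disagreement concrete, here is a minimal sketch in plain Python. It is a toy, not any real vision transformer: the single-head attention uses identity Q/K/V projections, and the patch vectors and positional offsets are made-up values. It only illustrates that self-attention without positional information is permutation-equivariant, and that adding encodings tied to the slot index breaks that.

```python
import math

def self_attention(tokens):
    """Toy single-head self-attention with identity Q/K/V projections (an assumption)."""
    dim = len(tokens[0])
    out = []
    for q in tokens:
        # Scaled dot-product scores of this query against every token.
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(dim) for k in tokens]
        m = max(scores)
        weights = [math.exp(s - m) for s in scores]  # numerically stable softmax
        total = sum(weights)
        out.append([sum(w / total * v[d] for w, v in zip(weights, tokens))
                    for d in range(dim)])
    return out

def add_positions(tokens, pos):
    """Add one positional vector to each token, element-wise."""
    return [[t + p for t, p in zip(tok, pe)] for tok, pe in zip(tokens, pos)]

def close(a, b, tol=1e-9):
    return all(abs(x - y) < tol for ra, rb in zip(a, b) for x, y in zip(ra, rb))

# Hypothetical patch embeddings and positional encodings (illustrative values only).
patches = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, -0.5]]
pos = [[0.0, 0.1], [0.2, 0.0], [0.0, 0.3], [0.4, 0.0]]
perm = [2, 0, 3, 1]
shuffled = [patches[i] for i in perm]

# 1) No positional information: shuffling the input just shuffles the output.
plain = self_attention(patches)
equivariant = close([plain[i] for i in perm], self_attention(shuffled))

# 2) Encodings tied to the slot index: the shuffled image gives different outputs.
with_pos = self_attention(add_positions(patches, pos))
with_pos_shuffled = self_attention(add_positions(shuffled, pos))
position_sensitive = not close([with_pos[i] for i in perm], with_pos_shuffled)

# 3) Encodings that travel with their patches: equivariance is restored.
moved_pos = [pos[i] for i in perm]
consistent = close([with_pos[i] for i in perm],
                   self_attention(add_positions(shuffled, moved_pos)))

print(equivariant, position_sensitive, consistent)
```

Case 2 is why positional encodings exist at all. Case 3 is the sense in which a fixed re-ordering, applied consistently, leaves the arithmetic unchanged.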


It's not ridiculous if you understand how neural networks actually work. Your perception of the numbers has nothing to do with the logic of the arithmetic in the network.


Do you know what "positional encoding" means?


Completely irrelevant to the point being made.



