You can navigate a website without visually decoding the image of a website.

bonoboTP · 2025-10-07T21:18:20 1759871900

Except if it's a messy div soup with various shitty absolute and relative pixel offsets, where the only way to know what refers to what is to render it and apply gestalt principles.

measurablefunc · 2025-10-07T21:53:01 1759873981

None of that matters to neural networks.

bonoboTP · 2025-10-07T22:00:58 1759874458

It does, because it's hard to infer where each element will end up in the render. A checkbox may be set up in a shitty way such that the corresponding text label is not properly placed in the DOM, so it's hard to tell what the checkbox controls based on the DOM tree alone. You have to take the styling and pixel placement into account, i.e. render it properly and look at it.

That's just one obvious example, but the principle holds more generally.
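
A minimal sketch of the checkbox case, assuming made-up elements and pixel coordinates (the `Element` class, the ids, and the offsets are all hypothetical, not taken from any real page). It compares guessing a checkbox's label from DOM order with guessing it from rendered position:

```python
from dataclasses import dataclass
from math import hypot


@dataclass
class Element:
    kind: str   # "checkbox" or "label"
    text: str   # hypothetical id or visible text
    x: float    # assumed rendered left offset in pixels
    y: float    # assumed rendered top offset in pixels


# List order stands in for DOM order; it deliberately disagrees with
# where the (hypothetical) absolute positioning puts each element.
dom = [
    Element("checkbox", "cb_a", 20, 100),
    Element("label", "Subscribe to newsletter", 50, 40),
    Element("checkbox", "cb_b", 20, 40),
    Element("label", "Accept terms", 50, 100),
]


def label_by_dom_order(elements, checkbox):
    """Guess the label as the next label after the checkbox in DOM order."""
    start = elements.index(checkbox)
    for el in elements[start + 1:]:
        if el.kind == "label":
            return el
    return None


def label_by_geometry(elements, checkbox):
    """Guess the label as the one rendered closest to the checkbox."""
    labels = [el for el in elements if el.kind == "label"]
    return min(labels, key=lambda el: hypot(el.x - checkbox.x, el.y - checkbox.y))


for cb in (el for el in dom if el.kind == "checkbox"):
    print(cb.text,
          "| DOM guess:", label_by_dom_order(dom, cb).text,
          "| geometry guess:", label_by_geometry(dom, cb).text)
```

In this toy layout the two heuristics disagree for both checkboxes, which is the situation described above: the pairing is only visible once positions are known.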

measurablefunc · 2025-10-07T22:03:53 1759874633

Spatial continuity has nothing to do w/ how neural networks interpret an array of numbers. In fact, there is nothing about the topology of the input that is in any way relevant to what calculations are done by the network. You are imposing an anthropomorphic structure that does not exist anywhere in the algorithm & how it processes information. Here is an example to demonstrate my point: https://x.com/s_scardapane/status/1975500989299105981

bonoboTP · 2025-10-07T22:26:04 1759875964

It would have to implicitly render the HTML+CSS to know which two elements visually end up next to each other, if the markup is spaghetti and badly done.

measurablefunc · 2025-10-07T22:37:54 1759876674

The linked post demonstrates arbitrary re-ordering of image patches. Spatial continuity is not relevant to neural networks.

bonoboTP · 2025-10-07T22:49:38 1759877378

That's ridiculous, sorry. If that were so, we wouldn't have positional encodings in vision transformers.
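
A rough sketch of why positional encodings come up here, assuming a toy single-head self-attention with identity Q/K/V projections and random vectors standing in for patch embeddings (none of this is a real ViT; all names are illustrative). Without position information, shuffling the patches only shuffles the outputs; once a per-slot vector is added, a shuffled input is a genuinely different input:

```python
import math
import random


def softmax(row):
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(tokens):
    """Single-head attention with identity Q/K/V projections (a simplification)."""
    d = len(tokens[0])
    out = []
    for q in tokens:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in tokens]
        weights = softmax(scores)
        out.append([sum(w * v[j] for w, v in zip(weights, tokens)) for j in range(d)])
    return out


def add_positions(tokens, pos):
    """Add the positional vector of slot k to whatever token sits in slot k."""
    return [[t + p for t, p in zip(tok, pe)] for tok, pe in zip(tokens, pos)]


def close(a, b, tol=1e-9):
    return all(abs(x - y) < tol for ra, rb in zip(a, b) for x, y in zip(ra, rb))


rng = random.Random(0)
patches = [[rng.uniform(-1, 1) for _ in range(4)] for _ in range(5)]
pos = [[rng.uniform(-1, 1) for _ in range(4)] for _ in range(5)]
perm = [3, 0, 4, 1, 2]
shuffled = [patches[i] for i in perm]

# No positions: the output for a shuffled input is the shuffled output.
plain = self_attention(patches)
equivariant = close(self_attention(shuffled), [plain[i] for i in perm])

# With positions: the same shuffle no longer just permutes the output rows.
with_pos = self_attention(add_positions(patches, pos))
with_pos_shuffled = self_attention(add_positions(shuffled, pos))
position_sensitive = not close(with_pos_shuffled, [with_pos[i] for i in perm])

print("equivariant without positions:", equivariant)
print("order matters with positions:", position_sensitive)
```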

measurablefunc · 2025-10-07T22:58:16 1759877896

It's not ridiculous if you understand how neural networks actually work. Your perception of the numbers has nothing to do w/ the logic of the arithmetic in the network.

bonoboTP · 2025-10-07T23:19:12 1759879152

Do you know what "positional encoding" means?

measurablefunc · 2025-10-07T23:21:58 1759879318

Completely irrelevant to the point being made.

ionwake · 2025-10-07T22:29:35 1759876175

Why are you talking about image processing? The guy you’re talking to isn’t.

measurablefunc · 2025-10-07T22:34:43 1759876483

What do you suppose "render" means?

bonoboTP · 2025-10-07T22:48:18 1759877298

The original comment I replied to said "You can navigate a website without visually decoding the image of a website." I replied that decoding is necessary to know where the elements will end up in a visual arrangement, because that arrangement often carries semantics. A label that is rendered next to another element can be crucial for understanding the functioning of the program. It's nontrivial to tell, just from the HTML or whatever tree structure, where each element will appear in 2D after rendering.

measurablefunc · 2025-10-07T23:00:38 1759878038

2D rendering is not necessary for processing information by neural networks. In fact, the image is flattened into a 1D array & loses the topological structure almost entirely b/c the topology is not relevant to the arithmetic performed by the network.
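
For reference, a small sketch of what row-major flattening does to neighborhoods, assuming a tiny 3x4 grid (the sizes and helper names are illustrative only). Vertical neighbors end up a full row width apart in the 1D array, while the end of one row sits directly next to the start of the next; the 2D coordinates remain recoverable from the index by arithmetic:

```python
H, W = 3, 4

# Each cell stores its own row-major index so the mapping is easy to inspect.
image = [[r * W + c for c in range(W)] for r in range(H)]
flat = [value for row in image for value in row]


def flat_index(r, c, width):
    """Row-major position of grid cell (r, c) in the flattened array."""
    return r * width + c


def grid_coords(i, width):
    """Inverse mapping: flat position back to (row, column)."""
    return divmod(i, width)


# (0, 1) and (1, 1) touch in 2D but are W entries apart in 1D.
vertical_gap = flat_index(1, 1, W) - flat_index(0, 1, W)

# Flat positions 3 and 4 are adjacent in 1D but lie on different rows.
end_of_row = grid_coords(3, W)
start_of_next = grid_coords(4, W)

print(flat)
print("vertical gap:", vertical_gap)
print("flat neighbors 3 and 4 map to:", end_of_row, start_of_next)
```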

bonoboTP · 2025-10-07T23:39:11 1759880351

I'm talking about HTML (or other markup, in the form of text) vs. an image. Simply getting the markup as text tokens makes it much harder to interpret, since it's not clear where the elements will end up. I guess I can't make this any clearer.

ionwake · 2025-10-08T09:23:21 1759915401

The guy you are talking to is either an utter moron, severely autistic, or for some weird reason he is trolling (it is a fresh account). I applaud you for trying to be kind and explain things to him; I personally would not have the patience.

measurablefunc · 2025-10-08T18:29:10 1759948150

Calm down gramps, it's not good for the heart to be angry all the time.

ionwake · 2025-10-09T12:07:20 1760011640

I'm not angry, I'm disappointed. People are going out of their way to help you understand a topic, and the best you can do is be patronising? It means you are overconfident, ignorant, rude, slow to learn, disrespectful and, I think, ungrateful.

If you read through the thread, to me it's apparent.

Even if you make burner accounts, this behaviour doesn't help one grow.

But hey, gramps could be wrong, eh.

measurablefunc · 2025-10-09T22:54:49 1760050489

Relax buddy, it's not that serious.

ionwake · 2025-10-10T10:05:32 1760090732

Yeah, you are right. Sorry bro, I didn't have enough coffee yesterday. My bad.

measurablefunc · 2025-10-10T16:26:31 1760113591

No one is perfect.