Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

Yea, really if you look at human learning/seeing/acting there is a feedback loop that LLM for example isn't able to complete and train on.

You see an object. First you have to learn how to control all your body functions to move toward it and grasp it. This teaches you about the 3 dimensional world and things like gravity. You may not know the terms, but it is baked in your learning model. After you get an object you start building a classification list "hot", "sharp", "soft and fuzzy", "tasty", "slick". Your learning model builds up a list of properties of objects and "expected" properties of objects.

Once you have this 'database' you create as a human, you can apply the logic to achieve tasks. "Walk 10 feet forward, but avoid the sharp glass just to the left". You have to have spatial awareness, object awareness, and prediction ability.

Models 'kind of' have this, but its seemingly haphazard, kind of like a child that doesn't know how to put all the pieces together yet. I think a lot of embodied robot testing where the embodied model feeds back training to the LLM/vision model will have to occur before this is even somewhat close to reliable.



Embodied is useful, but I think not necessary even if you need learning in a 3D environment. Synthesized embodiment should be enough. While in some cases[0] it may have problems with fidelity, simulating embodied experience in silico scales much better, and more importantly, we have control over time flow. Humans always learn in real-time, while with simulated embodiment, we could cram years of subjective-time experiences into a model in seconds, and then for novel scenarios, spend an hour per each second of subjective time running a high-fidelity physics simulation[1].

--

[0] - Like if you plugged a 3D game engine into the training loop.

[1] - Results of which we could hopefully reuse in training later. And yes, a simulation could itself be a recording of carefully executed experiment in real world.


> Humans always learn in real-time

In the sense that we can't fast-forward our offline training, sure, but humans certainly "go away and think about it" after gaining IRL experience. This process seems to involve both consciously and subconsciously training on this data. People often consciously think about recent experiences, run through imagined scenarios to simulate the outcomes, plan approaches for next time etc. and even if they don't, they'll often perform better at a task after a break than they did at the start of the break. If this process of replaying experiences and simulating variants of them isn't "controlling the flow of (simulated) time" I don't know what else you'd call it.


> Like if you plugged a 3D game engine into the training loop

Isn't this what synthesized embodiment basically always is? As long as the application of the resulting technology is in a restricted, well controlled environment, as is the case for example for an assembly-line robot, this is a great strategy. But I expect fidelity problems will make this technique ultimately a bad idea for anything that's supposed to interact with humans. Like self-driving cars, for example. Unless, again, those self-driving cars are segregated in dedicated lanes.




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: