Wow, the model architecture seems extremely simple. Very cool project.

MahiShafiullah · on Nov 29, 2023

Thank you! We tried to keep things as simple as possible on the policy side, but there is definitely a lot more room for innovation to get from 81% to 99%, such as using temporal information and the global structure of the task.

GaggiX · on Nov 29, 2023

Another limitation here is that the model is trained with regression, so if, for example, the task were to pick up one of two bottles and the demos were split 50/50 between them, the model would output the mean of the two actions, even though that mean is not a meaningful action.
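
The averaging effect can be seen in a toy case. This is only a minimal sketch under stated assumptions: the action is a hypothetical 1-D value where -1.0 means "reach for the left bottle" and +1.0 means "reach for the right bottle", and the observation is held fixed so the best regression output is a single number. None of these names or values come from the project itself.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical demos for one fixed observation:
# -1.0 = reach for the left bottle, +1.0 = reach for the right bottle, split ~50/50.
demos = rng.choice([-1.0, 1.0], size=1000)

# For a fixed input, the constant that minimizes mean squared error is the sample mean.
mse_optimal = demos.mean()

# Cross-check with a brute-force search over candidate outputs.
candidates = np.linspace(-1.0, 1.0, 201)
mse = ((demos[None, :] - candidates[:, None]) ** 2).mean(axis=1)
best_candidate = candidates[mse.argmin()]

# Distance from the regression output to the nearest demonstrated action.
nearest_mode_gap = np.min(np.abs(demos - mse_optimal))

print(f"MSE-optimal output: {mse_optimal:.3f}")
print(f"Grid-search optimum: {best_candidate:.3f}")
print(f"Gap to nearest demonstrated action: {nearest_mode_gap:.3f}")
```

The regression output lands near 0, between the two bottles, which is an action no demonstration ever took.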

MahiShafiullah · on Nov 29, 2023

Indeed! In fact, I have a project [0] from last year that uses a GPT-style transformer to address that exact issue :) However, it’s hard to get far beyond simulation in real home robotics without a good platform, and Dobb-E came out of the effort to build one.

[0] https://mahis.life/bet/
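
One common way to avoid the averaging problem is to treat action prediction as classification over discretized actions and then sample, instead of regressing a single value. The sketch below is an illustration of that general idea only, not the linked project's code: the bin centers, the 1-D action, and the count-based "policy" are all assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Same hypothetical demos: -1.0 = left bottle, +1.0 = right bottle, split ~50/50.
demos = rng.choice([-1.0, 1.0], size=1000)

# Assumed action bins (in practice these could come from clustering the demo actions).
centers = np.array([-1.0, 1.0])

# Assign each demonstrated action to its nearest bin.
bin_ids = np.abs(demos[:, None] - centers[None, :]).argmin(axis=1)

# Stand-in for a learned classifier: empirical probability of each bin.
probs = np.bincount(bin_ids, minlength=len(centers)) / len(demos)

# Sampling a bin always yields a demonstrated mode, never the in-between average.
sampled = centers[rng.choice(len(centers), size=5, p=probs)]

print("bin probabilities:", probs)
print("sampled actions:", sampled)
```

Every sampled action is one of the two demonstrated modes, so the policy commits to a bottle instead of reaching between them.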

GaggiX · on Nov 29, 2023

I've also seen the one that uses a diffusion process for planning. I imagine it's even slower, but maybe something can be done about that with a consistency loss.