I am not aware of any autoregressive transformer model that doesn't follow scaling laws. Tesla only needs to tokenize actions and images and voilà, self-driving capabilities. The problem, of course, is how you deploy such a big model so every car can run it locally.
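Roughly what I mean by tokenizing actions and images, as a toy PyTorch sketch. Every number and name here (the VQ codebook size, the action bins) is an assumption for illustration, not anything Tesla has published:

```python
import torch

IMAGE_VOCAB = 8192   # assumed codebook size from an image tokenizer (e.g. a VQ-VAE)
ACTION_BINS = 256    # assumed discretization of each continuous control

def tokenize_frame(vq_codes: torch.Tensor) -> torch.Tensor:
    """vq_codes: (H*W,) ints in [0, IMAGE_VOCAB) from an image tokenizer."""
    return vq_codes  # image tokens occupy ids [0, IMAGE_VOCAB)

def tokenize_action(steer: float, accel: float) -> torch.Tensor:
    """Discretize continuous controls into bins, offset past the image vocab."""
    def to_bin(x: float) -> int:  # map [-1, 1] -> [0, ACTION_BINS)
        return int((x + 1) / 2 * (ACTION_BINS - 1))
    base = IMAGE_VOCAB
    return torch.tensor([base + to_bin(steer), base + ACTION_BINS + to_bin(accel)])

# One training sequence: [frame tokens..., action tokens...] repeated per timestep,
# then fed to a decoder-only transformer with a plain next-token prediction loss.
frame = torch.randint(0, IMAGE_VOCAB, (16 * 16,))  # toy 16x16 grid of image tokens
step = torch.cat([tokenize_frame(frame), tokenize_action(steer=0.1, accel=-0.3)])
```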
Scaling parameter count requires a similar increase in the amount of (accurate, labeled) data. This can be mitigated by “bootstrapping” techniques that make labeling new data easier, but data is still likely the bottleneck for training such a model effectively (on the compute side, they can presumably spin up a supercomputer to scale the model itself).
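To put a rough number on “a similar increase in data”: under the Chinchilla rule of thumb of roughly 20 training tokens per parameter, the data requirement grows linearly with parameter count. Toy arithmetic, not Tesla's numbers:

```python
# Back-of-the-envelope data needs under the ~20 tokens/parameter rule of thumb
# (Hoffmann et al., 2022). Purely illustrative; says nothing about Tesla's setup.

def tokens_needed(params: float, tokens_per_param: float = 20.0) -> float:
    return params * tokens_per_param

for params in (1e9, 1e10, 1e11):
    print(f"{params:.0e} params -> ~{tokens_needed(params):.1e} training tokens")

# 1e+09 params -> ~2.0e+10 training tokens
# 1e+10 params -> ~2.0e+11 training tokens
# 1e+11 params -> ~2.0e+12 training tokens
```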
I don’t think I would do very well under Elon’s management style ha - the whole butts-in-seats from 8 to 6 thing is basically a dead end for me (personally).
I’m not sure what architecture they use, but they do indeed already have a pretrained “auto-labeler” that their annotators use. My understanding is that, due to hallucinations from the model and the risks involved with driving, the labels still need to be vetted manually before being added to the dataset.
Makes sense. Fortunately, scale is less of a problem for the auto-labeler, so they can throw a lot of processing power at this step. But yeah, they will need human labelers for the edge cases where the model is unsure. I wish Tesla would publish their results so we could understand what they are doing and how much it is improving.
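Something like a confidence gate seems like the natural way to split the work: auto-labels go straight into the dataset only when the model is very confident, and everything else is queued for human review. A minimal sketch, with the threshold and the Label shape being my own assumptions rather than anything Tesla has described:

```python
from dataclasses import dataclass

@dataclass
class AutoLabel:
    object_class: str
    confidence: float  # model's own probability estimate, in [0, 1]

ACCEPT_THRESHOLD = 0.98  # assumed cutoff: only very confident labels skip review

def route(label: AutoLabel) -> str:
    """Accept confident auto-labels; send uncertain ones to a human annotator."""
    if label.confidence >= ACCEPT_THRESHOLD:
        return "add_to_dataset"
    return "send_to_human_review"

print(route(AutoLabel("pedestrian", 0.99)))  # add_to_dataset
print(route(AutoLabel("pedestrian", 0.71)))  # send_to_human_review
```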