I think the weird thing about this is that it's completely true right now, but in X months it may be totally outdated advice.
For example, efforts like OpenMoE https://github.com/XueFuzhao/OpenMoE or similar will probably eventually lead to very competitive performance and cost-effectiveness for open-source models, at least in terms of competing with GPT-3.5 for many applications.
I also believe that within say 1-3 years there will be a different type of training approach that does not require such large datasets or manual human feedback.
> I also believe that within say 1-3 years there will be a different type of training approach that does not require such large datasets or manual human feedback
This makes a lot of sense. A small model that "knows" enough English and a couple of programming languages should be enough to replace something like Copilot, use plug-ins, or do RAG over a substantially larger dataset (roughly sketched below).
The issue right now is that, to get a model that can do those things, current training algorithms still need massive amounts of data, far more than the end user actually needs.
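A minimal sketch of that small-model-plus-retrieval pattern, in case it helps. The model names, documents, and helper functions here are all illustrative placeholders, not a recommendation of any specific stack:

    # Minimal retrieval-augmented generation sketch: a small local model answers
    # from retrieved context instead of having to memorize the whole corpus.
    # Model names and documents below are placeholders for illustration only.
    import numpy as np
    from sentence_transformers import SentenceTransformer
    from transformers import pipeline

    documents = [
        "The foo() helper retries failed requests up to three times.",
        "Configuration lives in config.yaml under the 'service' key.",
    ]

    embedder = SentenceTransformer("all-MiniLM-L6-v2")  # small embedding model
    doc_vecs = embedder.encode(documents, normalize_embeddings=True)

    def retrieve(query, k=1):
        # Cosine similarity via dot product of normalized vectors.
        q = embedder.encode([query], normalize_embeddings=True)[0]
        idx = np.argsort(doc_vecs @ q)[::-1][:k]
        return [documents[i] for i in idx]

    generator = pipeline("text-generation", model="gpt2")  # stand-in for any small local LM

    def answer(query):
        context = "\n".join(retrieve(query))
        prompt = f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"
        return generator(prompt, max_new_tokens=50)[0]["generated_text"]

    print(answer("Where does configuration live?"))

The point is that the generator only has to be fluent enough to read the retrieved context, so it can be far smaller than a model that has to carry all the knowledge in its weights.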
> I also believe that within say 1-3 years there will be a different type of training approach that does not require such large datasets or manual human feedback.
I guess if we ignore pretraining, doesn't sample-efficient fine-tuning on carefully curated instruction datasets sort of achieve this? LIMA and OpenOrca show some really promising results to date.
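For anyone curious what that looks like mechanically, here is a rough sketch of LIMA-style supervised fine-tuning on a tiny curated instruction set. The base model, prompt format, data, and hyperparameters are placeholders, not the actual setups from those papers:

    # Sketch of sample-efficient supervised fine-tuning on a small curated
    # instruction dataset (LIMA-style). Everything here is a placeholder.
    from transformers import (AutoModelForCausalLM, AutoTokenizer,
                              Trainer, TrainingArguments)

    model_name = "gpt2"  # stand-in for whatever base model is being tuned
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    tokenizer.pad_token = tokenizer.eos_token
    model = AutoModelForCausalLM.from_pretrained(model_name)

    # A curated dataset is tiny: on the order of ~1k high-quality pairs.
    pairs = [
        ("Explain what a mixture-of-experts layer is.",
         "It routes each token to a small subset of expert sub-networks..."),
    ]

    def to_features(prompt, response):
        text = f"### Instruction:\n{prompt}\n\n### Response:\n{response}"
        enc = tokenizer(text, truncation=True, max_length=512,
                        padding="max_length", return_tensors="pt")
        labels = enc.input_ids[0].clone()
        labels[enc.attention_mask[0] == 0] = -100  # ignore padding in the loss
        return {"input_ids": enc.input_ids[0],
                "attention_mask": enc.attention_mask[0],
                "labels": labels}

    train_data = [to_features(p, r) for p, r in pairs]

    trainer = Trainer(
        model=model,
        args=TrainingArguments(output_dir="sft-out", num_train_epochs=3,
                               per_device_train_batch_size=1),
        train_dataset=train_data,
    )
    trainer.train()

It doesn't remove the need for pretraining, but the fine-tuning stage itself needs only a small, carefully chosen dataset rather than manual feedback at scale.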
DistilBERT was trained from BERT. There might be an angle in using another model to train your model, especially if you're trying to get something to run locally.
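For reference, the core of that idea is knowledge distillation: train the small student to match the large teacher's softened output distribution. A bare-bones sketch with placeholder models and data (DistilBERT's full recipe also adds masked-LM and cosine-embedding losses on top of this):

    # Bare-bones knowledge-distillation step: the student learns to match the
    # teacher's softened output distribution. Models and data are placeholders.
    import torch
    import torch.nn.functional as F

    teacher = torch.nn.Linear(128, 1000)   # stand-in for a large frozen model
    student = torch.nn.Linear(128, 1000)   # stand-in for a much smaller model
    optimizer = torch.optim.AdamW(student.parameters(), lr=1e-4)
    T = 2.0  # temperature softens the teacher distribution

    def distill_step(batch):
        with torch.no_grad():
            teacher_logits = teacher(batch)
        student_logits = student(batch)
        # KL divergence between softened distributions, scaled by T^2 as in Hinton et al.
        loss = F.kl_div(F.log_softmax(student_logits / T, dim=-1),
                        F.softmax(teacher_logits / T, dim=-1),
                        reduction="batchmean") * T * T
        loss.backward()
        optimizer.step()
        optimizer.zero_grad()
        return loss.item()

    print(distill_step(torch.randn(8, 128)))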
> For example, efforts like OpenMoE https://github.com/XueFuzhao/OpenMoE or similar will probably eventually lead to very competitive performance and cost-effectiveness for open-source models, at least in terms of competing with GPT-3.5 for many applications.
Also see https://laion.ai/