
I wonder why some words are overrepresented. Isn't the whole idea of language models to model the word distribution as closely as possible? Does it have something to do with RLHF? Or is it the training data?


Language models would be fairly useless for most people if they accurately modelled the source distribution; they'd be no better than autocomplete. In fact, they were fairly useless when they did model the source distribution: that's why ChatGPT was an instant hit whereas GPT-3 was mainly interesting only to other AI researchers.

What made LLMs suddenly interesting was that the responses were much more like answers and much less like additional questions in the same vein as the prompt.
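
To make that concrete: a base model only assigns probabilities to next tokens, so the likeliest continuations of a question tend to be more text in the same vein rather than an answer. A minimal sketch using Hugging Face transformers, with GPT-2 standing in as the base model (GPT-2 is an assumption here, chosen because it's small and public):

    # Minimal sketch: inspect the raw next-token distribution of a base LM.
    # GPT-2 stands in for any non-instruction-tuned model.
    import torch
    from transformers import GPT2LMHeadModel, GPT2Tokenizer

    tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
    model = GPT2LMHeadModel.from_pretrained("gpt2")
    model.eval()

    prompt = "Why is the sky blue?"
    inputs = tokenizer(prompt, return_tensors="pt")

    with torch.no_grad():
        logits = model(**inputs).logits[0, -1]  # logits for the next token
    probs = torch.softmax(logits, dim=-1)

    # For a base model these top tokens tend to continue the text
    # (more question-like prose), not begin an answer.
    top = torch.topk(probs, 5)
    for p, idx in zip(top.values, top.indices):
        print(f"{tokenizer.decode(idx)!r}  p={p:.3f}")

Run it and the top tokens are usually continuations of the prose, not the start of an explanation; instruction tuning and RLHF are what shift that distribution toward answer-shaped text (and toward the overrepresented words the parent asks about).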


>In fact, they were fairly useless when they did model the source distribution: that's why ChatGPT was an instant hit whereas GPT-3 was mainly interesting only to other AI researchers.

I had a bot that used the original GPT-3 (i.e. the completion model, not the chat model), and its answers were pretty decent (with the right prompting). Often they were even better than GPT-3.5's, whose answers were overly formulaic in comparison ("as an AI language model...", "it's important to..." all the time).
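
The prompting trick is basically few-shot formatting: frame the prompt as a Q&A transcript and let the completion model fill in the next answer. A rough sketch against the legacy (pre-1.0) openai SDK's completions endpoint; the model name and the example Q&A pair are placeholders, not the actual setup described above (and the original base davinci engines have since been retired):

    # Rough sketch of answer-style prompting against the legacy completions
    # endpoint (pre-1.0 openai SDK). Model name is a placeholder; the
    # original base "davinci" engines have since been retired.
    import openai

    openai.api_key = "sk-..."  # your API key

    prompt = (
        "Q: What causes tides?\n"
        "A: The gravitational pull of the Moon and, to a lesser extent, the Sun.\n"
        "\n"
        "Q: Why is the sky blue?\n"
        "A:"
    )

    response = openai.Completion.create(
        model="davinci",   # a raw completion model, not a chat model
        prompt=prompt,
        max_tokens=60,
        temperature=0.3,
        stop=["\n\n"],     # stop before the model invents the next question
    )
    print(response.choices[0].text.strip())

Because the few-shot examples set the register, you get direct answers without the boilerplate an RLHF'd chat model tends to add.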


I think that means you would count as "another AI developer" ^_^;



