Discussions about LLM alignment often overlook data quality and quantity. Current models like Llama 2 use 10K+ prompts and responses for supervised fine-tuning (SFT) and 100K+ human preference pairs. While preferences are relatively easy to annotate, producing a good SFT dataset is hard.
I read here that Yann LeCun claimed that even with RLHF, LLMs will still hallucinate, and that it's an unavoidable consequence of their autoregressive nature.
Moreover, it's all about the use case. If you need a high degree of reliability and reproducibility, don't use LLMs! Not yet, at least. That's fine though, because they offer a ton of value in solving problems where that isn't needed.
> If you need a high degree of reliability and reproducibility, don't use LLMs!
This is true of pretty much all of machine learning. LLMs are just getting singled out because their outputs are not getting the same level of validation that typically occurs with older approaches. BERT models will also spit out wacky stuff, depending on how they're trained/fine-tuned/used/etc.
When the next token is a URL, and the URL does not match the preceding anchor text.
Additional layers of these 'LLMs' could read the responses, determine whether the premises are valid and the logic is sound enough to support the presented conclusion(s), and then suggest a different citation URL for the preceding text.
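A minimal sketch of that kind of procedural check, assuming a simple word-overlap heuristic between the anchor text and the generated URL (the function name, heuristic, and sample citations are purely illustrative, not from any library):

```python
import re
from urllib.parse import urlparse

def url_matches_anchor(anchor_text: str, url: str) -> bool:
    """Heuristic: does the URL plausibly correspond to its anchor text?
    Purely illustrative; it just looks for overlap between anchor words
    and the URL's domain/path."""
    parsed = urlparse(url)
    url_blob = (parsed.netloc + parsed.path).lower()
    anchor_words = re.findall(r"[a-z0-9]+", anchor_text.lower())
    return any(len(w) > 3 and w in url_blob for w in anchor_words)

# A citation whose anchor text overlaps the URL passes the check...
print(url_matches_anchor("Hugging Face Transformers docs",
                         "https://huggingface.co/docs/transformers"))        # True
# ...while a mismatched one gets flagged for review or re-suggestion.
print(url_matches_anchor("Llama 2 technical report",
                         "https://example.com/attention-is-all-you-need"))   # False
```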
For many NLP tasks (which is what I mostly use LLMs for), hallucinations can be prevented with simple, procedural checks against the input or a controlled vocabulary. For example, for NER tasks, you can just check whether the extracted entities are valid relative to either of the two.
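For concreteness, here's a minimal sketch of such a check for NER output, assuming the extracted entities arrive as plain strings from an LLM prompt (the helper and the sample data are hypothetical):

```python
def validate_entities(entities: list[str], source_text: str,
                      controlled_vocab: set[str] | None = None) -> list[str]:
    """Keep only entities that appear in the input text or, if a controlled
    vocabulary is supplied, are members of it. Anything else is treated as a
    likely hallucination and dropped."""
    valid = []
    for ent in entities:
        in_source = ent.lower() in source_text.lower()
        in_vocab = controlled_vocab is not None and ent in controlled_vocab
        if in_source or in_vocab:
            valid.append(ent)
    return valid

# Hypothetical usage: `extracted` would come from an LLM-based NER prompt.
text = "Alice flew from Zurich to Toronto on Monday."
extracted = ["Alice", "Zurich", "Toronto", "Geneva"]   # "Geneva" is hallucinated
print(validate_entities(extracted, text))              # ['Alice', 'Zurich', 'Toronto']
```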
edit: I don't like your linked article at all. It's subtly misleading and/or misinformed, like Yahoo News but for ML.
To clarify: no one (certainly not OpenAI) suggested that RLHF was useful for reducing hallucinations. It's not for that. The insinuation that it was designed (at least partially) for that purpose and yet "failed" is faulty. Hallucinations are a known issue with large language models, and while I appreciate LeCun reiterating that, researchers far less prominent than LeCun are also aware of that fact.
What people in the media are calling 'hallucination' in large language models is really imagination and creativity. It's inherent in the models, and it's why they have been able to unlock so many amazing cognitive capabilities where previous efforts have failed.
That's like saying that a simple linear regression predicting the wrong (x, y) value is just "being creative". The LLM lacks the capacity to make the correct word prediction. It's not being creative; it's just flat-out wrong.
Sure, your linear regression example is great, and it's actually an example of how 'hallucination' in models can be good. Imagine a prediction algorithm that had to fit every point in noisy data: it would be either something overfitted, like an ugly high-order polynomial that passes through every point, or just a database of every (x, y) pair in the training set. Neither of those hallucinates if you ask for data from the training set, but for most purposes they wouldn't be as useful as a fitted linear regression, which 'hallucinates' y values even when you give it an exact x from the training set. Of course, if you just want to memorize and repeat back the training set, a database is the best way!
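A toy sketch of that contrast, with arbitrary made-up data: a lookup table reproduces the training pairs exactly but has no answer for unseen x, while the fitted regression 'hallucinates' a slightly different y even at a training x.

```python
import numpy as np

# Noisy training data around y = 2x + 1 (arbitrary illustrative numbers).
rng = np.random.default_rng(0)
x = np.arange(10, dtype=float)
y = 2 * x + 1 + rng.normal(scale=0.5, size=x.shape)

# Fit a linear regression (degree-1 polynomial).
slope, intercept = np.polyfit(x, y, deg=1)

# "Memorization": an exact lookup table of the training pairs.
lookup = dict(zip(x.tolist(), y.tolist()))

query = 3.0
print("training y:      ", lookup[query])              # exact, but only works for seen x
print("regression y:    ", slope * query + intercept)  # slightly off even at a training x
print("unseen x = 3.5:  ", slope * 3.5 + intercept)    # the lookup table has no answer here
```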
So by your definition of creativity, the LLM is always being creative because it's always using probability to determine the next word (it's not just spitting out facts from a database). Hallucination is just when that prediction, or creativity as you call it, is wrong.
https://evalovernite.substack.com/p/rlhf-math-aint-enough
https://doi.org/10.5281/zenodo.8186168