LLMs also love to double down on solutions that don't work.
Case in point, I'm working on a game that's essentially a website right now. Since I'm very very bad with web design I'm using an LLM.
It's perfect 75% of the time. The other 25% it just doesn't work. Multiple LLMs will misunderstand basic tasks, add properties that don't exist, and invent functions.
It's like you hired a college junior who insists they're never wrong and keeps pushing non-functional code.
The entire mindset is "whatever, it's close enough, good luck."
God forbid you need to do anything using an uncommon node module or anything like that.
> LLMs also love to double down on solutions that don't work.
“Often wrong but never in doubt” is not proprietary to LLMs. It’s off-putting and we want them to be correct and to have humility when they’re wrong. But we should remember LLMs are trained on work created by people, and many of those people have built successful careers being exceedingly confident in solutions that don’t work.
When it comes to programming, tell me you don't know so I can do something else. I ended up just refactoring my UX to work around it. In this case it's a personal prototype, so it's not a big deal.
That is definitely an issue with many LLMs. I've had limited success including instructions like "Don't invent facts" in the system prompt and more success saying "that was not correct. Please answer again and check to ensure your code works before giving it to me" within the context of chats. More success still comes from requesting second opinions from a different model -- e.g. asking Claude's opinion of Qwen's solution.
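For what it's worth, that second-opinion loop is easy to script. Here's a rough sketch in TypeScript, assuming both models are reachable through OpenAI-compatible chat endpoints; the base URLs and model names below are placeholders, not real values:

```typescript
import OpenAI from "openai";

// Placeholder endpoints and keys; swap in whatever providers you actually use.
const qwen = new OpenAI({ baseURL: "https://example.com/qwen/v1", apiKey: process.env.QWEN_KEY });
const claude = new OpenAI({ baseURL: "https://example.com/claude/v1", apiKey: process.env.CLAUDE_KEY });

async function secondOpinion(task: string): Promise<string> {
  // First model drafts a solution, with a system prompt nudging it away from invention.
  const draft = await qwen.chat.completions.create({
    model: "qwen-placeholder",
    messages: [
      { role: "system", content: "Don't invent facts. If you are unsure, say so instead of guessing." },
      { role: "user", content: task },
    ],
  });
  const solution = draft.choices[0].message.content ?? "";

  // Second model reviews the first model's answer instead of re-solving from scratch.
  const review = await claude.chat.completions.create({
    model: "claude-placeholder",
    messages: [
      { role: "system", content: "Review another model's answer. Flag invented APIs, missing properties, and code that won't run." },
      { role: "user", content: `Task:\n${task}\n\nProposed solution:\n${solution}` },
    ],
  });
  return review.choices[0].message.content ?? "";
}
```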
To the other point, not admitting to gaps in knowledge or experience is also something that people do all the time. "I copied & pasted that from the top answer in Stack Overflow so it must be correct!" is a direct analog.
So now you have an overconfident human using an overconfident tool, both of which will end up coding themselves into a corner? Compilers at least, for the most part, offer very definitive feedback that acts as a guard rail for those overconfident humans.
Also, let's not forget LLMs are a product of the internet and anonymity. Human interaction on the internet is significantly different from in-person interaction, where people are typically more humble and less overconfident. If someone at my office acted like some of the overconfident SO/reddit/HN users, I would probably avoid them like the plague.
A compiler in the mix is very helpful. That and other sanity checks wielded by a skilled engineer doing code reviews can provide valuable feedback to other developers and to LLMs. The knowledgeable human in the loop makes the coding process and final products so much better. Two LLMs with tool-usage capabilities reviewing the code isn't as good, but it is available today.
The LLM's overconfidence comes from it spitting out the most-probable tokens given its training data and your prompt. When LLMs learn real hubris from actual anonymous internet jackholes, we will have made significant progress toward AGI.