Does Claude Code use a different model than Claude.ai? Because Sonnet 4 and Opus 4 routinely get things wrong for me. Both of them have sent me on wild goose chases, where they confidently claimed "X is happening" about my code but were 100% wrong. They also hallucinated APIs, and just got a lot of details wrong in general.
The problem space I was exploring was libusb and Python, and I used both ChatGPT and Claude.ai to help debug some issues and flesh out some skeleton code. Claude's output was almost universally wrong. ChatGPT got a few things wrong, but was in general a lot closer to the truth.
AI might be coming for our jobs eventually, but it won't be Claude.ai.
The reason Claude Code is “good” is that it can run tests, compile the code, run a linter, etc. If you actually pay attention to what it’s doing, at least in my experience, it constantly fucks up, but can sort of correct itself by taking feedback from outside tools. Eventually it proclaims “Perfect!” (which annoys me to no end), and spits out code that at least looks like it satisfies what you asked for. Then if you just ignore the tests that mock all the useful behaviors out, the amateur-hour mistakes in data access patterns, and the security vulnerabilities, it’s amazing!
> Eventually it proclaims “Perfect!” (which annoys me to no end),
This has done wonders for me:
# User interaction
- Avoid sycophancy.
- Do what has been asked; nothing more, nothing less.
- If you are asked to make a change, summarize the change but do not explain its benefits.
- Be concise in phrasing but not to the point of omission.
You're right, but you can actually improve it pretty dramatically with subagents. Once you get into a groove with them, it really makes a big difference.