It's notable that Anthropic are still using SWE-bench as a coding benchmark rather than the newer, more difficult DeepSWE, which shows them well behind GPT 5.5.
Bear in mind that all the marketing efforts, such as solving an Erdős problem, are the result of concerted RL training to impart those narrow capabilities. How much any benchmark result, or "early access" paid-shill vibe report, reflects improved performance on more general real-world use cases remains to be seen.
Well, I have just tested it and GPT 5.5 is still smarter. It catches bugs that Fable doesn’t. Anthropic's Fable is basically still sloppy like Opus 4.x. I also got the downgrade for “cyber violations” while trying to build a custom Debian ISO…that tells me their safeguards are sh**. I didn’t ask it to hack anything, just to make a script that builds a custom Debian distribution with various settings…so this Fable thing seems like a flop & slop already. That warning plus the privacy change is the wake-up call to move away from Anthropic.
Claude Code will write the whole thing for you, whereas doesn’t Copilot require input along the way? I.e., it doesn’t do all the programming for you.
Yes, of course, it can also spawn subagents, work for an hour without interaction if that's what you want, etc., just like any other harness.
Actually, due to GitHub's stupid billing system, which charges per "premium request" instead of per token, you could, and still can, abuse it so it costs nothing. They're changing it to usage-based billing from next month, though.
This always comes up and the only thing I can think is: Doesn't Google make like 10B a quarter in profit from GCP alone? Did we really need a cheaper SQL injection checker?
Anthropic and Claude are running circles around Google / Gemini for me these days. Anthropic was quite helpful for a while, but strange limit issues started popping up. The final straw was a bug that essentially broke my ability to develop. I moved over to Claude Code full time and haven't looked back. Opus 4.6 is awesome for accelerating probabilistic programming!