This caught my attention today as it was in dispute and there were a lot of comments (mostly angry). I like browsing Polymarket (never placed a bet myself, but it is fun to see the prices). Always thought this would kill someone someday.
I find that Opus misses a lot of details in the code base when I want it to design a feature or something. It jumps to a basic solution which is actually good but might affect something elsewhere.
GPT 5.4 on codex cli has been much more reliable for me lately. I used to have opus write and codex review; I now do the opposite (actually, I have codex write and both review in parallel).
So on the latest models for my use case gpt > opus but these change all the time.
Edit: also the harness is shit. Claude code has been slow, weird and a resource hog. It refuses to read the now-standardized .agents dirs, so I need symlink gymnastics. It hides as much info as it can… Codex cli is working much better lately.
Codex CLI is so much more pleasant to use than CC. I cancelled my CC subscription after the OpenCode thing, but somewhat ironically have recently found myself naturally trying the native Codex CLI client first more often over OpenCode.
Kinda funny how you don't actually need to use coercion if you put in the engineering work to build a product that's competitive on its own technical merits...
Interesting! It did well for a first try. This was my prompt:
Lets play elevator saga! Here's the initial implementation:
{
    init: function(elevators, floors) {
        var elevator = elevators[0]; // Let's use the first elevator
        // Whenever the elevator is idle (has no more queued destinations) ...
        elevator.on("idle", function() {
            // let's go to all the floors (or did we forget one?)
            elevator.goToFloor(0);
            elevator.goToFloor(1);
        });
    },
    update: function(dt, elevators, floors) {
        // We normally don't need to do anything here
    }
}
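For anyone who hasn't played it, here is a minimal sketch of where a solution usually goes from that starter: react to button presses instead of bouncing between floors 0 and 1. This is not the model's output. The event names and methods (`floor_button_pressed`, `up_button_pressed`, `down_button_pressed`, `goToFloor`, `floorNum`, `currentFloor`, `destinationQueue`) are from the Elevator Saga API as I remember it; the `request` helper, the `solution` variable and the "park at floor 0" policy are my own assumptions.

```javascript
// Sketch only. The dispatch policy is an illustrative assumption,
// not a tuned or "official" Elevator Saga solution.
const solution = {
    init: function(elevators, floors) {
        const elevator = elevators[0]; // still a single elevator

        // Hypothetical helper: queue a floor only if it is not queued already.
        function request(floorNum) {
            if (!elevator.destinationQueue.includes(floorNum)) {
                elevator.goToFloor(floorNum);
            }
        }

        // A passenger inside the elevator picked a destination.
        elevator.on("floor_button_pressed", function(floorNum) {
            request(floorNum);
        });

        // A passenger waiting on a floor called the elevator.
        floors.forEach(function(floor) {
            floor.on("up_button_pressed", function() {
                request(floor.floorNum());
            });
            floor.on("down_button_pressed", function() {
                request(floor.floorNum());
            });
        });

        // Nothing queued: park at the ground floor unless already there.
        elevator.on("idle", function() {
            if (elevator.currentFloor() !== 0) {
                elevator.goToFloor(0);
            }
        });
    },
    update: function(dt, elevators, floors) {
        // Nothing needed here for this sketch.
    }
};
```

In the game itself you would paste only the object literal (everything after `const solution = `, without the trailing semicolon), since the editor expects that shape. It serves requests in arrival order, so it is far from optimal, but it should at least visit every floor that asks for it.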
I'd say this. I haven't had a Windows desktop computer with serious problems since 2007.
I had a long string of Windows laptops that were basically OK from maybe 2013 to 2023 except for problems with USB that got progressively worse over time (for each machine). I think some of them were real hardware problems, but I also think the USB 3 spec doesn't guarantee that you can plug in very many devices and have it work; it depends on the PCIe architecture inside the machine. That "ding" sound when a USB device disconnects from Windows has traumatized me and I've turned it off anywhere I can because it is like a gunshot to a Vietnam vet.
I found very little literature about other Windows laptop users facing these problems, but endless posts by AppleCare frequent fliers who seem to spend their lives at the Genius Bar getting their old defective laptops replaced with new defective laptops. I think Windows users just expect it to be all screwed up.
For a long time Windows has struggled with processes that suck down a lot of resources at boot time. At home it is things that do software updates and saturate my 2x20Mbps internet connection. At work it is the backup program that saturates my Ethernet.
> I haven't had a Windows desktop computer with serious problems since 2007.
In roughly the last decade, I've had motherboards fail on me, drives fail on me, PSU fans have issues with the bearings (rattle even when it works), front USB headers not working, SATA SSDs overheat and fry themselves, HDDs fail, separate USB ports get fried, overheating issues, multi-GPU issues (that one's on me, OSes struggle with supporting those), sometimes systems even having bad performance like back when I had a Ryzen 5 4500 and Intel Arc A580 where games would run with low 1% lows BUT nothing would show up as the bottleneck in any monitoring program ever, I've had Windows bootloader freak out across updates, sometimes updates in the OS render themselves not installable for months, sometimes graphics drivers (AMD and Intel) having issues, especially with VR and so much more stuff.
I like to think that I have particularly bad luck and not high enough quality parts and sometimes just pretty jank setups due to the state of my wallet. I've also had laptops fall apart and phone batteries turn into pillows, so go figure. Also regular Debian/Ubuntu updates sometimes bringing down my homelab servers that also run on consumer hardware, so maybe it's definitely got something to do with a lack of luck.
Less so with higher quality parts and machines, like most of the ThinkPads I've owned have been pretty good and my current M1 MacBook Air is still okay (really good note taking machine for being on the move) and same for my iPhone SE, despite the OSes feeling kinda odd. Doesn't really condemn any individual setup in my eyes - as far as I'm concerned, they all suck to some degree and everything that can go wrong sooner or later will, that's just the way it is.
That said, I welcome more (relatively) affordable hardware with decent build quality - ofc running Linux distros or whatever else one desires on them would be nice too, as would more repairability.
I had a run of I think 4 laptops that I never paid for. I bought a spendy windows surfacebook back in the day, and opted for the extended warranty. The screen went out about a week before the warranty expired, they couldn't replace the screen, and so refunded me the entire purchase price, plus the cost of the warranty... which then went to the next laptop.
That cycle repeated itself either two or three more times, up to today, and my current laptop is I think going to be the one that finally lasts long enough that I'll have to actually pay for my next upgrade.
Mostly getting stuff done on the Windows machine in the next room or playing music off a different stereo or playing music on cassette tape or minidisc on the same stereo, etc. It's easier to fall back to a world of 20th century electronics where latency is imperceptible than it is to dive into a world of third-party apps that were all designed around somebody's inscrutable KPIs but didn't consider at all my convenience or inconvenience. Probably it is Creative Cloud updating or some software for the mouse or some kind of crap, and if I sat in front of the machine for 30 minutes it might settle down, but it's rare that I sit in front of it for 30 minutes. It used to be that kind of thing wrecked the Windows experience, but over a long period of time Microsoft did a lot of work to balance the load of startup processes and mostly you don't feel it.
My wife browses the web a lot on that Mac, she hasn't complained since I installed Firefox + uBlock Origin but maybe she expects it to be slow.
I find time to first token more important than tok/s generally, as these models wait an ungodly amount of time before streaming results. It looks like the claims are true based on M5: https://www.macstories.net/stories/ipad-pro-m5-neural-benchm... so this might work great.
Not sure. I'm a heavy AI user at this point. Oh, also a heavy Apple user, and I have never once used an Apple AI thing since they released them. I don't even know what they released. It is a complete failure of execution on their part.
Might work out fine on codex.