No, no, it's been pretty easy with software engineering. I work on two types of projects and it's very easy to ask Claude for a plan, then have GPT-5.5 rip it to shreds and find legit issues, and vice versa. If both GPT-5.5 and Claude 4.8 can independently create a plan and both find no critical or high issues in the other's, then we will be at that point.
Additionally, running GPT-5.5 on medium sometimes gives me better results than high mode. On any of them I still have to push the models in the right direction.
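For anyone who wants to script that loop, here is a rough sketch of the "each model plans, the other one reviews, stop when nobody finds critical or high issues" check. Everything in it is an assumption for illustration: `Issue`, `cross_review`, `all_clear`, and the stub planners/reviewer are made-up names, and the stubs stand in for whatever real model calls you would wire up yourself.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Issue:
    severity: str  # hypothetical scale: "critical", "high", "medium", "low"
    note: str


Planner = Callable[[str], str]           # task -> plan text
Reviewer = Callable[[str], List[Issue]]  # plan text -> issues found

BLOCKING = {"critical", "high"}


def cross_review(task: str,
                 planners: Dict[str, Planner],
                 reviewers: Dict[str, Reviewer]) -> Dict[str, List[Issue]]:
    """Have every model plan independently, then let every *other* model
    review that plan. Returns the blocking issues found per planner."""
    results: Dict[str, List[Issue]] = {}
    for planner_name, make_plan in planners.items():
        plan = make_plan(task)
        found: List[Issue] = []
        for reviewer_name, review in reviewers.items():
            if reviewer_name == planner_name:
                continue  # a model does not review its own plan
            found.extend(i for i in review(plan) if i.severity in BLOCKING)
        results[planner_name] = found
    return results


def all_clear(results: Dict[str, List[Issue]]) -> bool:
    """True only when no plan drew a critical or high issue."""
    return all(not issues for issues in results.values())


# Stubs standing in for real model calls (assumptions, not a real API).
def stub_planner_a(task: str) -> str:
    return f"Plan A for {task}: ship without tests"


def stub_planner_b(task: str) -> str:
    return f"Plan B for {task}: add tests, then ship"


def stub_reviewer(plan: str) -> List[Issue]:
    if "without tests" in plan:
        return [Issue("high", "no test coverage")]
    return [Issue("low", "naming nit")]


results = cross_review(
    "migrate db",
    planners={"a": stub_planner_a, "b": stub_planner_b},
    reviewers={"a": stub_reviewer, "b": stub_reviewer},
)
print(all_clear(results))
```

In practice you would still read the reviews yourself; the severity labels are only as good as the reviewer that assigns them.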
I had a similar idea. I picked 10 lbs of morels last year, my first time picking. It was a recent burn area, burned 8 months prior. I was just back out in the same area and there are no morels, but lots of small orange-capped mushrooms. ChatGPT Pro said the first year is the best and then it drops off in the second year. I might try a much higher elevation spot in a week or two, but it really sucks. Last year I was finding morels on southeast-facing slopes. I'm sure north slopes produced later on, as I saw people coming off the hill when I drove by.
South-facing (in the US) tends to produce earlier due to the increased warmth, with north-facing producing mid to late season. Fruiting has been suppressed in my area due to lack of rain. Best of luck!
My maps aren’t in public release, but reach out if you want to give it a look.
It's a gimmick only for those who get sucked into buying things that they don't need. I've been a Costco shopper for decades, and sure, I have succumbed to some useless stuff, but my Costco list is 90% the same month to month. I get appalled when I see the same items from my list at other stores, smaller and in a pack of 1 instead of 2-4, for more money. If electronics were priced like food, it would be like seeing a MacBook Pro for $2000 everywhere but $799 at Costco.
This is clever and provides a clean alternative to using custom plugins and MCP servers for doing code reviews.
For example, with the degradation of Claude in the past 1-2 months, I am always asking Codex to review Claude's plans and vice versa and I get excellent results that way.
Also, making a skill an API call allows for easy deployment, provided the tool calling can be isolated in an ephemeral sandbox for security.
Thanks! Sandbox deployment is planned in the roadmap. I already have a RuntimeAdapter interface in my architecture that I'll use to isolate the VMs. I'm doing exactly the same thing: I'm cross-referencing the models to challenge their plan, and my code reviewer agent's API is a big help.
I agree. I use Codex 5.4 xhigh as my reviewer and it catches major issues with Opus 4.6 implementation plans. I'm pretty close to switching to Codex because of how inconsistent Claude Code has become.
No, most systems in daily life can be understood if you are willing to take the time.
That doesn’t mean you personally are required to, but some people do and your interaction with the system of social trust determines how much of that remains opaque to you.
Yeah, I went through my global Claude skills and /context yesterday because Claude was performing terribly. I deleted a bunch of stuff, including memory, and anecdotally got better results later on in the day.
It’s shifting for knowledge workers too, we just need to pivot. I have had many app ideas for a while and now AI lets me build them quickly. Access to education and knowledge led to your advanced education; now access to cheap, fast building leads to product execution. Use your PhD brain to come up with a well-researched idea and plan, and then go execute.
Those who are essentially vibe coding will find their code large, brittle, and unmaintainable beyond a certain size, depending on how it is organized. They will be able to make 100x the toys, but toys aren't what make the world work.
Yeah, but those are amateurs. Every developer like you and me is going to do the same, or be whipped into doing the same. But the world only needs so many games, so many TODO apps, so many... So either you are already a top developer, which ofc means you shouldn't worry, or else.