Opus 4.5 catches all sorts of things a linter would not, and with little manual prompting at that. Missing DB indexes, forgotten migration scenarios, inconsistencies with similar services, an overlooked edge case.
Now I'm getting a robot to review the branch at regular intervals and poke holes in my thinking. The trick is not to use an LLM as a confirmation machine.
It doesn't replace a human reviewer.
I don't see the point of paying for yet another CI integration doing LLM code review.
I came to the same conclusion and ended up wiring a custom pipeline with LangGraph and Celery. The markup on the SaaS options is hard to justify given the raw API costs. The main benefit of rolling it yourself seems to be the control over context retrieval—I can force it to look at specific Postgres schemas or related service definitions that a generic CI integration usually misses.
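The context-retrieval step could look something like this: rows fetched from Postgres's `information_schema.columns` get rendered into a text block that is prepended to the review prompt. This is a minimal sketch under my own assumptions (the `build_schema_context` name and the row shape are invented for illustration), not LangGraph or Celery API:

```python
# Sketch: turn rows already fetched from information_schema.columns into
# a schema summary for the review prompt. Row shape is assumed to be
# (table_name, column_name, data_type, is_nullable).
from collections import defaultdict

def build_schema_context(columns):
    """Render column metadata as one text block per table."""
    tables = defaultdict(list)
    for table, column, dtype, nullable in columns:
        suffix = " NULL" if nullable == "YES" else " NOT NULL"
        tables[table].append(f"  {column} {dtype}{suffix}")
    lines = []
    for table in sorted(tables):
        lines.append(f"TABLE {table}")
        lines.extend(tables[table])
    return "\n".join(lines)
```

The point is just that you decide what lands in the context window, instead of hoping a generic integration finds the right tables.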
In Norse mythology the first man Ask was carved out of a piece of ash tree and the first woman Embla out of a piece of elm. Ash is a good choice for tool handles and elm for constructing homes.
The thing that gets me is the assumption that we're not complex creatures who might each value different things at different times and in different contexts.
As for me, sometimes I code because I want something to do a specific thing, and I honestly couldn't be bothered to care how it happens.
Sometimes I code because I want something to work a very specific way or to learn how to make something work better, and I want to have my brain so deep in a chunk of code I can see it in my sleep.
Sometimes the creative expression is in the 'what' - one of my little startup tasks these days is experimenting with UI designs for helping a human get a task done as efficiently as possible. Sometimes it's in the 'how' - implementing the backend to that thing to be ridiculously fast and efficient. Sometimes it's both and sometimes it's neither!
A beautiful thing about code is that it can be a tool and it can be an expressive medium that scratches our urge to create and dive into things, or it can be both at the same time. Code is the most flexible substance on earth, for good and for creating incredible messes.
I'll argue that the LLM can be a great ally when "I want to have my brain so deep in a chunk of code I can see it in my sleep" because it can help you see the whole picture.
I think I can get on board with this view. In the earlier LLM days, I was working on a project that had me building models of different CSVs we'd receive from clients. I needed to build classes that represented all the fields. I asked AI to do it for me and was very pleased with the results. It saved me an hour-long slog of copying the header rows, pasting them into a class, ensuring that everything was camel-cased, etc. But the key thing here is that that work was never going to be the "hard part". That was the slog. The real dopamine hit came from solving the actual problem at hand - parsing many different variants of a file and unifying the output in a homogeneous way.
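The slog being automated is roughly this: derive camel-cased field names for a model class from a CSV header row. A small sketch (the header values and the `to_camel_case` helper are illustrative, not from any real client file):

```python
# Sketch: derive camelCase field names from a CSV header row, the kind
# of mechanical transcription work described above.
import csv
import io
import re

def to_camel_case(header):
    """'First Name' -> 'firstName', 'ZIP Code' -> 'zipCode'."""
    words = [w for w in re.split(r"[^0-9A-Za-z]+", header.strip()) if w]
    return words[0].lower() + "".join(w.capitalize() for w in words[1:])

def fields_from_csv(text):
    """Read the first row of a CSV and return camelCase field names."""
    header = next(csv.reader(io.StringIO(text)))
    return [to_camel_case(h) for h in header]
```

Trivial to write, tedious to repeat for dozens of client files - which is exactly why handing it off feels like a win.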
Now, if I had just said, "Dear Claude, make it so I can read files from any client and figure out how to represent the results in the same way, no matter what the input is", I can agree I _might_ be stepping into "you're not gonna understand the software"-land. That's where responsibility comes into play. Reading the code that's produced is vital. I, however, am still not at the point where I'm giving feature work to LLMs. I make a plan for what I want to do and give the mundane stuff to the AI.
Not typing every line of code myself doesn't divorce me from the construction.
I frequently find that the code I write using agents is better code, because small improvements no longer cost me velocity or time. If I think "huh, I should really have used a different pattern for that but it's already in 100+ places around the codebase" fixing it used to be a big decision... now it's a prompt.
None of my APIs lack interactive debugging tools any more. Everything that needs documentation is documented. I'm much less likely to take on technical debt - you take that on when fixing it would cost more time than you have available, but those constraints have changed for me.
But... that's exactly the kind of thing I'm referring to.
You're blanket replacing chunks of code without actually considering the context of each one.
Personally - I still have mixed feelings about it. The Hyatt Regency walkway collapse was literally one of the examples brought up in my engineering classes about the risks of doing "simple pattern changes". I'm not referencing it out of thin air...
---
Havens Steel Company had manufactured the rods, and the company objected that the whole rod below the fourth floor would have to be threaded in order to screw on the nuts to hold the fourth-floor walkway in place. These threads would be subject to damage as the fourth-floor structure was hoisted into place. Havens Steel proposed that two separate and offset sets of rods be used: the first set suspending the fourth-floor walkway from the ceiling, and the second set suspending the second-floor walkway from the fourth-floor walkway.[22]
This design change would prove fatal. In the original design, the beams of the fourth-floor walkway had to support only the weight of the fourth-floor walkway, with the weight of the second-floor walkway supported completely by the rods. In the revised design, however, the fourth-floor beams supported both the fourth- and second-floor walkways, but were only strong enough for 30% of that load.
---
Just use a different pattern? In this case, the steel company also believed it was a quick pattern improvement... they avoided a complex installation issue with threaded rods. Too bad it killed some 114 people.
But I am considering the context of each one. It's just quicker not to modify the code by hand.
I'm going to use a human comparison here, even though I try to avoid them. It's like having a team of interns who you explain the refactoring to, send them off to help get it done and review their work at the end.
If the interns are screwing it up you notice and update your instructions to them so they can try again.
I guess. And I don't mean that as a jab at you, I read a lot of your content and agree with quite a bit of it - I'm just personally conflicted here still.
I've worked in a couple positions where the software I've written does actually deal directly with the physical safety of people (medical, aviation, defense) - which I know is rare for a lot of folks here.
Applying that line of thinking to those positions... I find it makes me a tad itchy.
I think there's a lot of software where I don't really mind much (ex - almost every SaaS service under the sun, most consumer grade software, etc).
And I'm absolutely using these tools in those positions - so I'm not really judging that. I'm just wondering if there's a line we should be considering somewhere here.
I've avoided working directly on safety critical software throughout my career because the idea that my mistakes could hurt people frightens me.
I genuinely feel less nervous about working on those categories of software if I can bring coding agents along for the ride, because I'm confident I can use those tools to help me write software that's safer and less likely to have critical bugs.
Armed with coding agents I can get to 100% test coverage, and perform things like fuzz testing, and get second and third opinions on designs, and have conversations about failure patterns that I may not personally have considered.
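The fuzz-testing part is the kind of thing an agent can scaffold in minutes. A minimal sketch of the idea (the `parse_qty` target and the harness are invented stand-ins; a real setup would use something like Hypothesis or AFL):

```python
# Sketch of a tiny fuzz harness: hammer a parser with random inputs and
# require that it only ever returns a value or raises ValueError.
# Any other exception would indicate a bug worth investigating.
import random
import string

def parse_qty(text):
    """Toy target: parse '<int> <unit>' strings, e.g. '3 kg'."""
    parts = text.split()
    if len(parts) != 2 or not parts[0].lstrip("-").isdigit():
        raise ValueError(f"bad quantity: {text!r}")
    return int(parts[0]), parts[1]

def fuzz(target, runs=1000, seed=0):
    rng = random.Random(seed)
    for _ in range(runs):
        s = "".join(rng.choice(string.printable)
                    for _ in range(rng.randrange(0, 20)))
        try:
            target(s)
        except ValueError:
            pass  # expected rejection; anything else propagates as a failure
```

Writing a harness like this by hand was never hard, just one more task competing for limited time - which is the constraint that's changed.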
For me, coding agents represent the ability for me to use techniques that were previously constrained by my time. I get more time now.
I share your concern. I'm flummoxed by the prevalent sentiment that code is the nasty underbelly of software. To me, a programming language is a medium both for precisely directing a computer's behavior and for precisely communicating a process to fellow programmers (cue the Alan Perlis quote [1].)
I will concede that mainstream code is often characterized by excessive verbosity and boilerplate. This I attribute to the immaturity of today's crop of programming languages. Techniques like language-oriented-programming [2] hint at a path I find appealing: DSLs that are tailored to the problem while promising more precision than a natural language specification could.
To speculate, I could see LLMs helping during the creation of a DSL (surfacing limitations in the DSL's design) and when adding features to a DSL, to help migrate code written in the old version to the new one.
Perhaps DSLs aren't the future. However, will there be as much interest in designing new and superior programming languages now that code is seen as little more than assembly language?
No, I don't think so. But my context is different as is anyone's reply about their LLM usage.
I'm still creating software but with English that's compiled down to some other language.
I'm personally comfortable reading code in many languages. That means I'm able (hopefully!) to spot something that doesn't look quite right. I don't have to be the one pressing keys on the keyboard, but I'm still accountable for the code I compile and submit.
I think you’re in the pleading stage. The AI tools this year will do the whole process without you. Reading the code will be a luxury for little benefit.
I get paid to work with Shadcn and was equally surprised by its radio button. I can't see why Radix implements it this way, and it doesn't work without JS either.
What I do like about Shadcn, is that its components are yours to modify. In our case, it was easy to replace it with a sensible label+input combination.
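The replacement was roughly the native pattern - a hypothetical sketch, not the exact code we shipped - which works without JS because the browser handles the radio semantics:

```html
<!-- Native radios: keyboard support, grouping, and form submission for free -->
<label><input type="radio" name="plan" value="starter" /> Starter</label>
<label><input type="radio" name="plan" value="pro" checked /> Pro</label>
```

Since Shadcn components live in your own source tree, swapping the Radix-based radio for this took minutes rather than a fork or a patch.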
I don't think any of the Trump crowd thought as far as these legal ramifications. Send in the Little Green Men, annex, and figure things out as they happen.
Opus 4.5 is at a point where it is genuinely helpful. I've got what I want and the bubble may burst for all I care. 640K of RAM ought to be enough for anybody.
Shocking. I enjoyed Black Mirror from the start but found it a bit on the nose. A Prime Minister fucking a pig? That's a bit much. And then, before you knew it, certain rumours about David Cameron arose.
I think that one PM episode was the only one totally out of whack. The rest were quite on point. There are dozens of episodes deep in the realm of reality.