> You are an expert in coding (10,000 hours and all that) so you know when the code is wrong.
While I appreciate the suggestion that I might be an expert, I am decidedly not. That said, I’ve been writing what the companies I’ve worked for would consider “mission critical” code (mostly Java/Scala, Python, and SQL) for about twenty years, I’ve been a Unix/Linux sysadmin for over thirty years, and I’ve been in IT for almost forty years.
Perhaps the modernity and/or popularity of the languages are my problem? Are the models going to produce better code if I target “modern” languages like Go/Rust, and the various HTML/JS/FE frameworks instead of “legacy” languages like Java or SQL?
Or maybe my experience is too close to bare metal and I need to focus on more trivial projects with higher-level or more modern languages? (FWIW, I don’t actually consider Go/Rust/JS/etc. to be higher-level or more “modern” than the JVM languages I’m experienced with; I’m open to arguments, though.)
> LLMs are insidious, it feeds into "everything is simple" concept a lot of us have of the world.
Yah, that’s what I mean when I say I feel gaslit.
> In reality, the LLM cannot do either role's job well.
I am aware of this. I’m not looking for an agent. That said, am I being too simplistic or unreasonable in expecting that I too could leverage these models (albeit perhaps after acquiring some missing piece of knowledge) as assistants capable of reasoning about my code, or even about the code they generate? If so, how are others able to get LLMs to generate what they claim are “deployable” non-trivial projects, or refactorings of entire “critical” projects from Python to Go? Is someone lying, or do I just need (seemingly dramatically) deeper knowledge of how to “correctly” prompt the models? Have I simply been a victim of (again, seemingly dramatically) overly optimistic marketing hype?
We have a similar amount of IT experience, although I haven't been a daily engineer for a long time. I use aider.chat extensively for fun projects, preferring the Claude backend right now, and it definitely works. This site is 90% aider, give or take, the rest my hand edits: https://beta.personacollective.ai -- and it involves solidity, react, typescript and go.
Claude does benefit from some architectural direction. I think it's better at extending than creating from whole-cloth. My workflow looks like:
1) Rough out some code, say a smart contract with the key features
2) Tell Claude to finish it and write extensive tests.
3) Run abigen on the Solidity contract to get a Go library
4) Tell Claude to stub out Go server event handlers for every event in that library
5) Create a React/TypeScript site myself with a basic page
6) Tell Claude to create an admin endpoint on the React site that pulls relevant data from the smart contracts.
6.5) Tell Claude to redesign the site in a preferred style.
7) Go through and inspect the code for bugs. There will be a bunch.
8) For bugs that are simple, prompt Claude to fix: "You forgot x,y,z in these files. fix it."
9) For bugs that are a misunderstanding of my intent, either code up the core loop directly that's needed, or negotiate and explain. Coding is generally faster. Then say "I've fixed the code to work how it should, update X, Y, Z interfaces / etc."
10) For really difficult bugs or places where I'm stumped, tar the codebase up, go to the chat interfaces of Claude and o1-preview, paste the codebase in (Claude can take a longer paste, but o1-preview is better at holistic bugfixing), and explain the problem. Wait a minute or two and read the comments. 95% of the time one of the two LLMs is correct.
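Step 4 above (one handler stub per contract event) can be sketched roughly like this. In the real workflow the event types and subscription methods come from the abigen-generated bindings; here `TransferEvent` and the dispatcher are hypothetical stand-ins so the pattern is self-contained and not tied to go-ethereum:

```go
package main

import "fmt"

// TransferEvent is a stand-in for a struct abigen would generate
// from a contract event; field names here are illustrative only.
type TransferEvent struct {
	From, To string
	Amount   uint64
}

// dispatcher maps event names to handlers, mirroring the
// "one stub per event in the generated library" step.
type dispatcher struct {
	handlers map[string]func(ev any) error
}

func newDispatcher() *dispatcher {
	return &dispatcher{handlers: map[string]func(any) error{}}
}

// on registers a handler stub for a named event.
func (d *dispatcher) on(name string, h func(any) error) {
	d.handlers[name] = h
}

// dispatch routes a decoded event to its stub, erroring loudly
// on any event you forgot to stub out.
func (d *dispatcher) dispatch(name string, ev any) error {
	h, ok := d.handlers[name]
	if !ok {
		return fmt.Errorf("no handler stubbed for event %q", name)
	}
	return h(ev)
}

func main() {
	d := newDispatcher()
	d.on("Transfer", func(ev any) error {
		t := ev.(TransferEvent)
		fmt.Printf("Transfer of %d from %s to %s\n", t.Amount, t.From, t.To)
		return nil
	})
	_ = d.dispatch("Transfer", TransferEvent{From: "0xabc", To: "0xdef", Amount: 42})
}
```

The point of the stub-everything step is the loud `dispatch` error: Claude fills in the boring handler bodies, and any event it (or you) missed fails fast instead of being silently dropped.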
This all pretty much works. For these definitions of works:
1) It needs handholding to maintain a codebase's style and naming.
2) It can be overeager: "While I was in that file, I ..."
3) If it's more familiar with an old version of a library you will be constantly fighting it to use a new API.
How I would describe my experience: a year ago, it was like working with a junior dev that didn't know much and would constantly get things wrong. It is currently like working with a B+ senior-ish dev. It will still get things wrong, but things mostly compile, it can follow along, and it can generate new things to spec if those requests are reasonable.
All that to say, my coding projects went from "code with pair coder / puppy occasionally inserting helpful things" to "most of my time is spent at the architect level of the project, occasionally up to CTO, occasionally down to dev."
Is it worth it? If I had a day job writing mission critical code, I think I'd be verrry cautious right now, but if that job involved a lot of repetition and boilerplate / API integration, I would use it in a HEARTBEAT. It's so good at that stuff. For someone like me who is like "please extend my capacity and speed me up" it's amazing. I'd say I'm roughly 5-8x more productive. I love it.
This is very good insight, the likes of which I’ve needed; thank you. Your workflow is moderately more complex and definitely less “agentic” than I’d expected/hoped but it’s absolutely not out of line with the kind of complexity I’m willing to tackle nor what I’d personally expect from pairing with or instructing a knowledgeable junior-to-mid level developer/engineer.
Totally. It’s actually an interesting philosophical question: how much can we expect at different levels of precision in requirements, and when is code itself the most efficient way to be precise? I definitely feel my communication limits more with this workflow, and often feel like “well, that’s a fair, totally wrong, but fair interpretation.”
Claude has the added benefit that you can yell at it, and it won’t hold it against you. You know, speaking of pairing with a junior dev.