Let there be an object (a car) in state 1 and myself in state 1. Let there be an action A (drive to the carwash) that I can do, which puts the object in state 2 and myself in state 2. Action B (wash the car) can only be performed correctly if both the object and I are in state 2. Alternatively, there is an action C (walk to the carwash) that does not change the state of the object but puts me in state 2. I want to do B: should I do action A or action C first?
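The setup above can be sketched as a tiny state model. This is just an illustration of the question, with made-up function names; it checks which first action satisfies B's precondition:

```python
# Minimal sketch of the planning puzzle. States are plain integers;
# the action names "A" and "C" are from the question above.

def apply(action, my_state, car_state):
    """Return (my_state, car_state) after performing an action."""
    if action == "A":      # drive to carwash: moves both me and the car
        return 2, 2
    if action == "C":      # walk to carwash: moves only me
        return 2, car_state
    raise ValueError(f"unknown action: {action}")

def can_do_b(my_state, car_state):
    # Action B (wash the car) requires both to be in state 2.
    return my_state == 2 and car_state == 2

# Starting from state 1 / state 1:
print(can_do_b(*apply("A", 1, 1)))  # True:  A enables B
print(can_do_b(*apply("C", 1, 1)))  # False: C leaves the car in state 1
```

So under this reading, A is the action that makes B possible, since C leaves the car behind.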
Hi, I’m running a 4-person startup based in Bangkok, Thailand, and we differentiate code quality based on priority. We try to ship only clean code to master, but when we talk with clients and they want a demo of a new product or feature, we use AI to rapidly create an MVP to see if it aligns with their needs. If they’re happy, we then refine the MVP until we’re happy with the code through manual review and refactoring, or we even rewrite it. We make sure our data is shaped correctly, hot paths are tested, and things are well separated by domain (domain-driven design). DDD ensures that if the code is shitty, only that part of the project is shitty. Only when the code is acceptable do we rebase onto master. I try to let engineers talk to clients so they learn the most from them first-hand, and then let them dictate smaller tasks for the AI to do; they’re more product-manager-like than a typical engineer would have been ten years ago. Do you think this is a good approach? I’m also curious what other startups do.
tldr: we aren’t confident in the code we write quickly, but we then take the time to make sure we’re confident before we merge to master
You must have immense patience to daily-drive Codex. To be honest, I’ve observed better code quality from Codex (in terms of separation of concerns, high cohesion/loose coupling, etc.), but Opus has great quality at roughly 1/3 of the speed. Maybe try it on Cursor, then decide if you want to switch. I’m curious: have you tried Gemini Pro 3, and do you think it deserves the hype?
Someone should make an open-source system that lets you easily host containers so that if one provider fails, you can easily switch over to another. Like the Vercel AI SDK, but for containers. That is, assuming Docker isn’t failing (it is right now, since it depends on Cloudflare).
Thank you for the reply! Good luck with your app too! I’ve heard of Claude Code filling out PRs, but in my experience I haven’t been able to pull that off successfully: it creates errors and doesn’t notice them itself. I’m experimenting with a pipeline in which it verifies the feature it created by writing a frontend integration test and taking a screenshot, using the Playwright MCP, to check whether or not it successfully executed the task. If it did not, it loops until it does. This removes the human-in-the-loop and probably (I’d want internal evals to prove this out) increases its correctness per run. The bottleneck then becomes code review and making sure the code it did write isn’t hot garbage.
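The verify-and-retry loop described here can be sketched roughly like this. The helper names (`generate_change`, `run_ui_check`) are hypothetical stand-ins for "have the agent edit the code" and "run the integration test plus screenshot check", not real APIs:

```python
# Hedged sketch of an agent retry loop: generate a change, verify it
# with a UI check, and feed failures back in until it passes or we
# give up and escalate to a human.

MAX_ATTEMPTS = 5

def agent_loop(task, generate_change, run_ui_check):
    """Return the attempt number on success, or None if all attempts fail."""
    feedback = None
    for attempt in range(1, MAX_ATTEMPTS + 1):
        generate_change(task, feedback)      # agent edits the code
        ok, feedback = run_ui_check(task)    # integration test + screenshot
        if ok:
            return attempt                   # verified: stop looping
    return None                              # still failing: needs a human
```

The point of the structure is that the inner loop is fully automated; a human only sees the `None` case (and, of course, the final code review).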