Frontier labs fine-tune their models for software development in the terminal/CLI-driven style, annotating datasets of programming tasks solved this way, and fine-tuning on the target workflow will almost always improve performance on it. Cursor, being mostly a wrapper, just uses the underlying foundation models inside its own framework and orchestrates on top of them, as opposed to training against actual learnable objectives to make the models better.