In our case - agentic loop optimization of kernels. Works like a dream - after all, you have a perfect python spec (validation), the kernel is small (under 10 kloc), and the entire thing is testable. The trick is to have enough back test data so that things like cache behavior are taken into account. Ended up with different kernel versions for batch-back-test vs real-time work - which was interesting. 5 years ago would have hired about 10 ppl for the job - now 2.