Perhaps even more relevant to your argument, most classes of problems lend thems...

Perhaps even more relevant to your argument, most classes of problems lend themselves to an adaptive learning rate. I.e. we would like to learn our separator or weights quickly when we have no information but we want learning to slow as we have more information (so that we can settle towards a solution representative of the data vs the last batch/data point/hypothesis update)

Methods like Ridge Regression also support your point. There is a lot of value to defining how learning takes place, and specifically more learning is often bad for our results(in ml).