"AI" as a term deserves this rep, because it was effectively marketing as far back as the 70s. But you're way off on "Machine Learning".
There's been plenty of progress in the last 15 years re-interpreting many ML methods as regression (any optimization is a regression if you set up the right likelihood function).
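To make that claim concrete, here is a minimal sketch (my own toy example, not from the post) showing that ordinary least-squares regression and loss-minimizing optimization find the same fit, because the Gaussian negative log-likelihood is, up to constants, the squared-error loss:

```python
import numpy as np

# Toy data: y = 2x + 1 + Gaussian noise. (Illustrative setup of my own;
# the coefficients and sample size are arbitrary.)
rng = np.random.default_rng(0)
x = rng.uniform(-1.0, 1.0, 200)
y = 2.0 * x + 1.0 + rng.normal(0.0, 0.1, 200)
X = np.column_stack([x, np.ones_like(x)])

# (1) "Regression": closed-form ordinary least squares.
ols = np.linalg.lstsq(X, y, rcond=None)[0]

# (2) "Optimization": minimize a loss by gradient descent. Under a
# Gaussian noise model the negative log-likelihood is (up to constants)
# exactly the squared-error loss, so both routes recover the same fit.
theta = np.zeros(2)
for _ in range(5000):
    grad = -2.0 * X.T @ (y - X @ theta) / len(y)
    theta -= 0.1 * grad

print(ols, theta)  # both close to [2, 1]
```

The same trick generalizes: swap the noise model and the "right likelihood" turns other losses (absolute error, cross-entropy) into maximum-likelihood regressions too.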
But many important results and techniques -- including today's ubiquitous deep nets -- originated and had successful applications well before they had statistical interpretations. They came from fields like compression theory, database design, or even biological modeling.
The term Machine Learning was introduced to re-focus the field on a measurable objective: algorithms that improve with more data. The "Learning" part was not an abstract term to tug on your imagination, but included formal definitions of how algorithms improve that involved slightly fewer assumptions than statistical learning (which is a subfield).
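That measurable objective -- improvement with more data -- can be demonstrated directly with a learning curve. A minimal sketch (my own toy example, using a 1-nearest-neighbor classifier on synthetic two-class data):

```python
import numpy as np

rng = np.random.default_rng(0)

def make_data(n):
    # Two Gaussian classes centered at (-1,-1) and (+1,+1).
    y = rng.integers(0, 2, n)
    centers = np.where(y[:, None] == 1, 1.0, -1.0)
    X = centers + rng.normal(size=(n, 2))
    return X, y

def one_nn_error(n_train, n_test=500):
    Xtr, ytr = make_data(n_train)
    Xte, yte = make_data(n_test)
    # 1-nearest-neighbor prediction via pairwise squared distances.
    d = ((Xte[:, None, :] - Xtr[None, :, :]) ** 2).sum(-1)
    pred = ytr[d.argmin(axis=1)]
    return (pred != yte).mean()

# The "learning" claim, made measurable: averaged over repeated draws,
# test error drops as the training set grows.
small = np.mean([one_nn_error(10) for _ in range(20)])
large = np.mean([one_nn_error(1000) for _ in range(20)])
print(small, large)
```

Nothing here requires a statistical interpretation of the classifier; the learning criterion is just held-out error as a function of training-set size.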
This lineage isn't that important today, but that focus on how learning is measured is still the most important guidepost, both for ML research and for sorting marketing BS from realistic claims. Certainly, state-of-the-art work using deep nets for tasks like NLP and image and video recognition isn't designed by reasoning about a statistical interpretation, or tested by applying typical statistical tests. Popularizing this work as Statistical Inference or Regression wouldn't add any intuition, and wouldn't really describe how ML research proceeds or how ML systems succeed or fail.
It works by fitting curves. Whether you have a (presumably mathematical) "statistical interpretation" or not is basically irrelevant to what it actually does and to what we should be conveying to people who don't know the field. This is not an academic argument.
Putting "stats" right there in the name is vastly, vastly more informative than "Learning", which for 99% of people connotes something requiring intelligence, and is therefore misleading. Hence the AI cons all pop up as soon as some public ML wins get called "Learning."
Generalizing from data is actually what statistics does. It's what ML is. People like Hinton, Wasserman, Tibshirani et al. seem to agree that ML is statistics, but even that isn't what I'm talking about here.