As an example of using semantic information, suppose you define a handful of functions that take two natural numbers and return some result, and then write proofs about those functions.
Let's say you write those proofs by induction over the natural numbers. If your functions take multiple arguments, you can choose which one to induct over, and the most effective choice often depends on how your functions are defined. So even to pick the best argument, you already need to unfold your definitions. Chances are you do that unfolding in your head, not using a tactic, or at least not one that you keep in your final proof.
So already, an ML technique for ITPs that does not unfold the definitions in goals for more information is missing the essential piece that makes a user choose one induction argument over another, even when the hypotheses and goals look syntactically identical (I can define "add" to break down the first argument or the second argument, my call).
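To make this concrete, here is a minimal sketch in Lean 4. The names `addL` and `addR` are hypothetical, chosen for illustration: they compute the same function, but one recurses on its first argument and the other on its second, so proving the same-looking statement about each requires inducting over a different variable.

```lean
-- Hypothetical definitions: two versions of addition that agree on all
-- inputs but pattern-match on different arguments.

-- addL recurses on its first argument.
def addL : Nat → Nat → Nat
  | 0,     m => m
  | n + 1, m => (addL n m) + 1

-- addR recurses on its second argument.
def addR : Nat → Nat → Nat
  | n, 0     => n
  | n, m + 1 => (addR n m) + 1

-- For addL, `addL n 0 = n` is NOT definitional: we must induct on n,
-- the argument addL breaks down.
theorem addL_zero (n : Nat) : addL n 0 = n := by
  induction n with
  | zero => rfl
  | succ n ih => simp [addL, ih]

-- For addR, the very same statement holds by rfl, because addR
-- breaks down its second argument instead.
theorem addR_zero (n : Nat) : addR n 0 = n := rfl
```

The goal `? n 0 = n` looks identical in both cases; only unfolding the definition tells you whether it is trivial or needs induction, and on which variable.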
In general, I think the ITP community likes to pretend that we use tactics to interact with a black box, but in practice most of us have internalized heuristics that involve introspecting on the structure of terms and types. So any model that lacks the information needed to internalize those heuristics is missing out.