...and that despite not being anywhere near as aggressive about exploiting UB as GCC or Clang, which shows that backend optimisations like instruction selection, scheduling, and register allocation are far more valuable (and predictable).
I don't think anyone disputes that? Most optimizing compiler literature doesn't even mention language semantics; the gains there are very much last-ditch rather than necessary.
I can't even find benchmarks of ICC vs a current GCC, but they were pretty even the best part of a decade ago. GCC is a mess compared to LLVM, but it's quick.