It will do a bit of that, but remember it has to work in real time and can only look ahead so far. Give it a fighting chance by helping out whenever you can.
Tangent, but this reminds me of this great talk by Matt Godbolt from CppCon 2017: “What Has My Compiler Done for Me Lately? Unbolting the Compiler's Lid”
It's comments like these that make me realize I have a lot to learn about computers. I understand those words separately, but together they sound like a line from Star Trek.
CPUs can execute multiple instructions per clock cycle (recent x86_64 can do 4-5).
However, instructions can take 1 or (commonly) several clock cycles to complete before their results are available, and any instruction that depends on such a result cannot start executing until then. Such dependencies are the bane of superscalar parallelism.
But sometimes things that look like dependencies are fake.
At first glance it may look like these instructions have to execute serially. But if the last two uses of `XMM0` are renamed to a different physical register, the dependency on that specific register disappears, and instructions 1 and 3 can execute in parallel, followed by 2 and 4 in parallel.
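To make the renaming argument concrete, here is a toy Python model. It is a rough sketch, not how any real CPU is implemented. The four-instruction sequence, the `Instr` type, and the `schedule` function are all made up for illustration; it assumes every instruction takes one cycle and that issue width is unlimited. The sequence is not the listing the comment above refers to, just one with the same shape: two independent load/use pairs that happen to reuse `xmm0`.

```python
from typing import NamedTuple


class Instr(NamedTuple):
    dest: str      # architectural register written
    srcs: tuple    # architectural registers read


# Hypothetical sequence: two independent pairs that both reuse xmm0.
PROGRAM = [
    Instr("xmm0", ()),               # 1: xmm0 <- load a
    Instr("xmm1", ("xmm0", "xmm0")), # 2: xmm1 <- xmm0 + xmm0
    Instr("xmm0", ()),               # 3: xmm0 <- load b   (reuses the name xmm0)
    Instr("xmm2", ("xmm0", "xmm0")), # 4: xmm2 <- xmm0 + xmm0
]


def schedule(program, rename, latency=1):
    """Return the earliest start cycle of each instruction.

    With rename=True only true (read-after-write) dependencies count.
    With rename=False an instruction must also wait for every earlier
    instruction that reads or writes its destination register
    (the "fake" write-after-read and write-after-write dependencies).
    """
    starts, finish = [], []
    for i, ins in enumerate(program):
        start = 0
        # True dependency: wait for the most recent earlier writer of each source.
        for src in ins.srcs:
            for j in range(i - 1, -1, -1):
                if program[j].dest == src:
                    start = max(start, finish[j])
                    break
        if not rename:
            # False dependencies: the register name itself is a shared resource.
            for j in range(i):
                if program[j].dest == ins.dest or ins.dest in program[j].srcs:
                    start = max(start, finish[j])
        starts.append(start)
        finish.append(start + latency)
    return starts


if __name__ == "__main__":
    print("without renaming:", schedule(PROGRAM, rename=False))
    print("with renaming:   ", schedule(PROGRAM, rename=True))
```

In this model the run without renaming starts the instructions in cycles 0, 1, 2, 3 (fully serial), while the run with renaming starts 1 and 3 in cycle 0 and 2 and 4 in cycle 1, which is the pairing described above.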
CPUs are designed with compiler optimizer experts in the loop. If the compiler can do it, the CPU won't try; instead, the CPU does things that the optimizer can't do. Note that this goes both ways: if an optimizer won't use something, the CPU won't do it, and if the optimizer wants something, the CPU will do it (obviously within the limits of what is possible, and all the other trade-offs).
Register renaming is designed assuming the optimizer