What matters is whether there is a shared representation space across languages. If there is, you can then (theoretically; there might be a few PhDs and a Nobel or two to be had :) separate the underlying structure from the mapping between that structure and any particular language.
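As a rough illustration of what "shared space" means in practice (just a sketch - it assumes sentence-transformers with a multilingual MiniLM checkpoint, and the sentence pairs are made up): embed a few translation pairs and check whether each sentence's nearest neighbour in the other language is its own translation.

    # Sketch: do translations land near each other in one embedding space?
    # Assumes `pip install sentence-transformers`; the model choice is illustrative.
    import numpy as np
    from sentence_transformers import SentenceTransformer

    model = SentenceTransformer("paraphrase-multilingual-MiniLM-L12-v2")

    english = ["The cat sat on the mat.",
               "I would like a cup of coffee.",
               "The weather is terrible today."]
    german = ["Die Katze saß auf der Matte.",
              "Ich hätte gerne eine Tasse Kaffee.",
              "Das Wetter ist heute furchtbar."]

    en = model.encode(english, normalize_embeddings=True)
    de = model.encode(german, normalize_embeddings=True)

    # Cosine similarities; a shared space shows up as a dominant diagonal,
    # i.e. each sentence's nearest neighbour is its own translation.
    sims = en @ de.T
    print(sims.round(2))
    print("matched:", bool((sims.argmax(axis=1) == np.arange(len(english))).all()))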
That second piece - the mapping from underlying structure to a specific language, what they call the universal embedding inverter - is likely much easier to train. There's a good chance certain structures are distinctive enough that you can map them to the underlying representation and then leverage that. But even if that's not viable, you can certainly run unsupervised training on raw material and see whether that same underlying "universal" structure pops out.
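To make the "map them to the underlying representation" part concrete, here's a minimal sketch of the classic supervised version of that mapping: learn an orthogonal rotation (Procrustes) between two embedding spaces from a small seed dictionary, then "translate" by nearest neighbour. The toy data is random and only stands in for real monolingual embeddings; the fully unsupervised variants build on the same idea.

    # Sketch: align two embedding spaces with an orthogonal map (Procrustes)
    # learned from a small seed dictionary, then "translate" by nearest
    # neighbour. Toy random data stands in for real monolingual embeddings.
    import numpy as np
    from scipy.linalg import orthogonal_procrustes

    rng = np.random.default_rng(0)
    dim, n_vocab, n_seed = 50, 1000, 200

    # Source-language embeddings, and a target language that is secretly a
    # rotation of the same underlying structure plus a little noise.
    src = rng.normal(size=(n_vocab, dim))
    hidden_rot, _ = np.linalg.qr(rng.normal(size=(dim, dim)))
    tgt = src @ hidden_rot + 0.01 * rng.normal(size=(n_vocab, dim))

    # Learn the mapping from the first n_seed "dictionary" pairs only.
    W, _ = orthogonal_procrustes(src[:n_seed], tgt[:n_seed])

    def unit(x):
        return x / np.linalg.norm(x, axis=1, keepdims=True)

    # Held-out words: does nearest neighbour recover the right pairing?
    sims = unit(src[n_seed:] @ W) @ unit(tgt[n_seed:]).T
    acc = (sims.argmax(axis=1) == np.arange(n_vocab - n_seed)).mean()
    print(f"precision@1 on held-out pairs: {acc:.2%}")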
There's a lot of hope and conjecture in all of the above, but the whole point of the article is that maybe, just maybe, you don't need context to translate.