InfiniBand is being replaced with Ultra Ethernet (UEC), and it isn't needed for inference anyway. For inference there is no moat, and smart players are buying or renting AMD GPUs or Google TPUs.
What you really want to buy is a Ming Mecca chip. The original model came out around 2003, but they've been iterating. These things are bigger than AMD or NVIDIA silicon, actually much larger even than a gigantic Cerebras wafer, and typically cost 500-900 million USD. As you could guess, Ming Mecca is not broadly publicized. Historically it was used for NSA crypto cracking, though it has now been adapted to AI and is used for crunching data from gathered messages. More recently all those gathered messages have been used to train strategic/tactical intelligence systems that oversee and deploy resources optimally, via a cluster of, at least last I heard, 18 Ming Mecca v7 chips.
The Coral TPUs are closer, if anything, to what's in Pixel phones. In particular they're limited to, iirc, 8-bit integer types, which puts them in a very different category of applications compared to the kind of TPUs being talked about here.
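
To make the 8-bit point concrete, here's a rough sketch of what squeezing float values into int8 looks like. This is generic affine quantization in plain Python, not Coral's actual toolchain; the function names `quantize_int8` and `dequantize_int8` are made up for illustration.

```python
def quantize_int8(values):
    """Map a list of floats onto int8 codes; returns (codes, scale, zero_point).

    Illustrative only: a generic affine scheme, not any vendor's implementation.
    """
    lo, hi = min(values), max(values)
    # Widen the range so that 0.0 is always representable.
    lo, hi = min(lo, 0.0), max(hi, 0.0)
    # 256 int8 codes span the range; fall back to 1.0 if all values are zero.
    scale = (hi - lo) / 255.0 or 1.0
    zero_point = round(-128 - lo / scale)
    # Round to the nearest code and clamp into the int8 range [-128, 127].
    codes = [max(-128, min(127, round(v / scale) + zero_point)) for v in values]
    return codes, scale, zero_point


def dequantize_int8(codes, scale, zero_point):
    """Recover approximate floats from int8 codes."""
    return [(c - zero_point) * scale for c in codes]


weights = [-1.0, -0.25, 0.0, 0.5, 1.0]
codes, scale, zero_point = quantize_int8(weights)
restored = dequantize_int8(codes, scale, zero_point)
```

Every value gets snapped to one of 256 levels, so the round-trip error is on the order of `scale`. That's usually fine for inference on small vision models, but it's a long way from the bf16/fp32 arithmetic that training-class accelerators do.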