Right, which makes it a bit of a tricky attack to pull off. But if you know what you're doing, you can do some operation that touches memory address x and be reasonably sure that x will end up in the CPU cache. If you then access memory address x and it happens really quickly, and you access memory address x+128 and it happens a bit slower, you can infer that x was in the cache and x+128 wasn't.
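To make the idea concrete, here is a rough sketch in C of the "touch x, then time x and x+128" comparison. The names `touch`, `time_load_ns` and `probe_pair` are made up for illustration, and it assumes a POSIX system with `clock_gettime`. A real attack would more likely use a cycle counter and an explicit cache flush, and would average many samples, because a single wall-clock measurement like this is very noisy and x+128 may already be cached for unrelated reasons.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <time.h>

/* Read one byte through a volatile pointer so the compiler has to emit
   an actual load instead of optimizing the access away. */
static inline uint8_t touch(const volatile uint8_t *p) {
    return *p;
}

/* Nanoseconds taken by a single load from p, using the monotonic clock.
   The clock call overhead is included, so treat the result as coarse. */
static int64_t time_load_ns(const volatile uint8_t *p) {
    struct timespec t0, t1;
    clock_gettime(CLOCK_MONOTONIC, &t0);
    (void)touch(p);
    clock_gettime(CLOCK_MONOTONIC, &t1);
    return (int64_t)(t1.tv_sec - t0.tv_sec) * 1000000000LL
         + (int64_t)(t1.tv_nsec - t0.tv_nsec);
}

/* Hypothetical helper: warm buf[x], then time loads of buf[x] and
   buf[x + offset]. Stores both timings and returns 1 if the load of
   buf[x] was strictly faster, 0 otherwise. */
static int probe_pair(const uint8_t *buf, size_t x, size_t offset,
                      int64_t *t_x, int64_t *t_other) {
    assert(buf != NULL && t_x != NULL && t_other != NULL);
    (void)touch(buf + x);                        /* pull x's line into the cache */
    *t_x = time_load_ns(buf + x);                /* likely a cache hit */
    *t_other = time_load_ns(buf + x + offset);   /* possibly a cache miss */
    return *t_x < *t_other;
}
```

You would call `probe_pair(buf, 0, 128, &t_x, &t_other)` on a buffer of at least 129 bytes and look at the two timings. Note that nothing here reads the cache directly: the only signal is how long an ordinary load takes.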
You load it into a register. If you're trying to drive it from a high-level language, I guess you can do something like an add, which will get compiled into instructions that load the value into a register first.
I was under the impression that there is no interface for reading data from the CPU caches, and that the cache is managed solely by the CPU itself.