> In a formal sense, anything you can compute on a computer is O(1)
I don't think the definition of Big-Oh supports that conclusion.
"Big O notation is a mathematical notation that describes the limiting behavior of a function when the argument tends towards a particular value or infinity." [1]
"For example, the time (or the number of steps) it takes to complete a problem of size n might be found to be T(n) = 4n^2 − 2n + 2.... As n grows large, the n^2 term will come to dominate,... say that the algorithm has order of n^2 time complexity" [1]
> Hash table lookup is absolutely bounded by a constant, but going from L1 to L2 to L3 to RAM with anywhere between 0 and 5 TLB misses is far from constant.
Indeed, but I can choose a value C that is large enough to negate any difference between a cache miss and a cache hit.
Let's say "N" is the number of lines in one of our text files. Once we have put that data into a hashtable, the claim is then that lookup time is O(1). You counter, and state, no, the lookup time is potentially 30ms, or even 100ms (making up the numbers). For my claim that the lookup time of O(1) to be true, I can pick any value of 'C'. So I choose 1 million. I'm basically saying, lookup times will always be under 1 hour. Given our precondition that we have built this hashtable, no matter how we get the data, yes - it will be under 1 hour. If we used carrier pigeon with USB stick, we could just choose a larger value for C and still be correct.
The idea is basically: if you have something like N^2 + N, then once N is large enough, the N^2 term becomes so dominant that you can multiply it by a fixed constant and wash away all the other terms in the expression.
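Spelled out against the formal definition, with f(N) = N^2 + N: since N <= N^2 for all N >= 1,

    f(N) = N^2 + N <= N^2 + N^2 = 2 * N^2

so the constants C = 2 and n0 = 1 witness f(N) <= C * N^2 for all N >= n0, which is exactly what f(N) = O(N^2) means.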
Thus, while the actual lookup time can and will vary, it is still bounded. Keep in mind as well that there are different functions for best-case and average-case performance; those notations already take into account that per-instance lookup time can vary.
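As a sketch of why the different cases get different functions (this is an engineered pathological example, not the normal behavior): in CPython, if every key hashes to the same value, dict lookups degrade from the average-case O(1) to a linear probe of the collision chain, i.e. O(n):

    class BadHash:
        """Pathological key: every instance hashes to the same bucket."""
        def __init__(self, x):
            self.x = x
        def __hash__(self):
            return 42  # constant hash forces every insert to collide
        def __eq__(self, other):
            return isinstance(other, BadHash) and self.x == other.x

    # Average case: ordinary keys spread across buckets, lookup is O(1).
    # Worst case: all BadHash keys collide, so each lookup probes a long
    # chain of entries -- O(n) -- even though the API looks identical.
    table = {BadHash(i): i for i in range(1000)}
    assert table[BadHash(500)] == 500  # correct, but walks the collision chain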
[1] https://en.wikipedia.org/wiki/Big_O_notation