I don’t understand this argument. If the CPU dissipated as many watts of heat as it drew from the wall, there wouldn’t be any energy left over to do actual useful work. Isn’t the extra 100W accounted for by things like changing the state of flip-flops? In other words, mustn’t one consider the entropy reduction of the system as an energy sink?
Clocking and changing register states requires charging and discharging the gate capacitance of a bunch of MOSFET transistors. The current that results from moving all that charge around encounters resistance, which converts it to heat. Silicon is only a "semi" conductor after all.
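To put rough numbers on that: dynamic power goes roughly as alpha * N * C * V^2 * f. Here's a quick sketch in Python, where every value is an illustrative assumption rather than a measurement:

    # Back-of-envelope dynamic power: P ~ alpha * N * C * V^2 * f.
    # Every number below is an illustrative assumption, not a measurement.
    alpha = 0.1        # activity factor: fraction of transistors switching per cycle
    N     = 1e9        # transistor count
    C     = 0.5e-15    # capacitance switched per transistor, farads (~0.5 fF)
    V     = 0.9        # supply voltage, volts
    f     = 4e9        # clock frequency, Hz

    P = alpha * N * C * V**2 * f
    print(f"Estimated dynamic power: {P:.0f} W")  # ~160 W, the right order of magnitude

Nearly all of that charge-shuffling ends up as heat in the resistance it flows through.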
You are correct that there is energy bound up in the information stored in the chip. But last I checked, our most efficient chips (e.g., experimental ones using reversible computing to avoid wasting that energy) are still orders of magnitude away from those theoretical limits.
Thank you for encouraging me to go on this educational adventure. I have now heard of Landauer’s principle, which says each bit of information releases at least about 2.9e-21 joules of heat (at room temperature) when erased: https://en.wikipedia.org/wiki/Landauer%27s_principle
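That figure is just kT ln 2 at room temperature, easy to check:

    import math

    k = 1.380649e-23            # Boltzmann constant, J/K
    T = 300                     # room temperature, K

    E = k * T * math.log(2)     # minimum heat released per erased bit
    print(f"Landauer limit at {T} K: {E:.2e} J")  # ~2.87e-21 J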
I think the numbers are more like <1W used in actual information processing and >239W lost as heat. Information, and the transformation of it, does have some inherent energy cost, but it is very, very small. And you end up getting that back as heat somewhere else down the line anyway.
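For anyone who wants to sanity-check that split, here's a back-of-envelope in Python; the transistor count and activity factor are loose assumptions about how busy the chip is:

    import math

    k, T = 1.380649e-23, 300
    landauer = k * T * math.log(2)   # ~2.9e-21 J per erased bit

    # Loose assumptions about how many bits a busy CPU erases per second:
    transistors = 1e9
    activity    = 0.1                # fraction toggling each cycle
    clock       = 4e9                # Hz

    bit_ops_per_sec = transistors * activity * clock
    info_power = bit_ops_per_sec * landauer
    print(f"Thermodynamically required power: {info_power:.1e} W")  # ~1e-3 W

So even <1W is a very generous ceiling; the thermodynamically required part is down in the milliwatts.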
Nope. Remember that you cannot destroy energy. The energy you use to flip the flip-flop still exists; it’s just disordered waste heat now instead of electricity.
Energy cannot be created or destroyed, but it can enter and leave an open system. When I lift a 10kg box 1 meter in the air, I don’t raise its temperature at all, and I only raise mine a tiny bit, yet I have still done work on the box and therefore have imparted energy to it. The energy came from food I ate earlier, and was ultimately stored in the box as gravitational potential energy.
Is this not analogous to storing energy in the EM fields within the CPU?
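(For scale, the energy parked in the box in that example is easy to compute:)

    # Energy imparted to the box in the example above:
    m, g, h = 10, 9.81, 1       # kg, m/s^2, m
    print(f"E = mgh = {m * g * h:.0f} J")   # ~98 J of gravitational potential energy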
CPUs don't store nontrivial amounts of energy, and even if storing a 1 were a significantly higher energy level than storing a 0 (or vice versa), there's no plausible workload that would cause the CPU to switch significantly more 0s to 1s than 1s to 0s (or vice versa).
Yes, but only briefly. When you study the thermodynamics of information you’ll discover that it’s actually erasing information that has a cost. Every time the CPU stores a value in a register it erases the previous value, dissipating energy in the process. In fact, every individual transistor has to erase the previous state on basically every clock cycle.
Curiously, there is a minimum cost to erase a single bit that no system can go below. It’s extremely small, billions of times smaller than the amount of energy our CPUs use every time they erase a bit, but it exists. Look up Landauer’s Limit. There is a similar limit on the maximum amount of information that can be stored in a system, which is proportional to the surface area of the sphere the information fits inside. Exceed that limit and you’ll form a black hole. We’re nowhere near that limit yet either.
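If you're curious about that second limit, here's a rough sketch of the area-scaling (holographic) bound for a 1 cm sphere; the radius is just an illustrative choice:

    import math

    l_p = 1.616255e-35   # Planck length, meters
    R   = 0.01           # sphere radius: 1 cm, purely illustrative

    A = 4 * math.pi * R**2                       # surface area of the sphere
    max_bits = A / (4 * l_p**2 * math.log(2))    # area-scaling (holographic) bound
    print(f"Max information in a 1 cm sphere: {max_bits:.1e} bits")  # ~1.7e66 bits

That is unimaginably far beyond anything we can actually store in a centimeter of silicon.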
>In fact, every individual transistor has to erase the previous state on basically every clock cycle.
This is incorrect in both directions.
Only transistors whose inputs are changing have to discharge their capacitance.
This means that if the inputs don't change, nothing happens; but if the inputs do change, the changes propagate through the circuit to the next flip-flop, possibly creating a cascade of changes.
Consider this pathological scenario: the first input changes, then a delay happens, then the second input changes so that the output remains the same. This is known as a "glitch". Even though the output hasn't changed, the downstream transistors see their input switch twice. Glitches propagate through downstream transistors, and not only that: if another unfortunate timing event happens, you can end up accumulating multiple glitches. A single transistor may switch multiple times in a clock cycle.
Switching transistors costs energy, which means you end up with "parasitic" power consumption that doesn't contribute to the calculated output.
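Here's a toy illustration of that glitch in Python, using an XOR gate whose two inputs flip a few time steps apart; the zero-delay gate model and the timings are idealized assumptions, not real circuit behaviour:

    # Toy glitch demo: both inputs of an XOR "should" flip together,
    # but input b lags input a, so the output toggles twice.
    def xor(a, b):
        return a ^ b

    # input waveforms over 6 time steps; b lags a by 2 steps
    a = [0, 1, 1, 1, 1, 1]
    b = [0, 0, 0, 1, 1, 1]

    out = [xor(ai, bi) for ai, bi in zip(a, b)]
    print("output:", out)   # [0, 1, 1, 0, 0, 0]

    # Count output transitions: each one charges/discharges downstream gates.
    transitions = sum(o1 != o2 for o1, o2 in zip(out, out[1:]))
    print("output transitions:", transitions)   # 2, even though start == end

The output starts and ends at 0, yet it toggled twice along the way, and each toggle burns energy in the downstream gates for nothing.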
My apologies if I wasn’t clear enough. I was only intending to make a statistical statement that the number of erasures is of a similar order to the number of transistors, not that every single transistor changes its state exactly once per cycle. Some don't change their state this cycle, others end up changing multiple times before settling. In fact, some are completely powered off! (Because you’re not using the built-in GPU right now, or you’re not doing AVX-512 right now, etc., etc.)
Note also that discharging the internal capacitance of a transistor, and the heat generated by current through the transistor’s internal resistance, are both costs over and above the fundamental cost of erasing a bit. Transistors can be made more efficient by reducing those additional costs, but Landauer discovered that nothing can reduce the fundamental cost of erasing a bit.
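To see how far apart those costs are, compare the 1/2 C V^2 switching energy of a single gate against kT ln 2. The capacitance and voltage below are rough assumptions, and a real register bit involves many such gates plus wire capacitance, so the full per-bit cost is higher still:

    import math

    k, T = 1.380649e-23, 300
    landauer = k * T * math.log(2)      # ~2.9e-21 J per bit

    # Assumed values for a single modern gate; real per-bit costs are higher
    # once you count the dozens of transistors and wires behind one register bit.
    C = 1e-16    # gate capacitance, ~0.1 fF
    V = 0.8      # supply voltage, volts
    switch = 0.5 * C * V**2             # energy to charge/discharge one gate

    print(f"One gate switch: {switch:.1e} J")       # ~3.2e-17 J
    print(f"Landauer limit:  {landauer:.1e} J")
    print(f"Ratio: {switch / landauer:.0f}x")       # ~10,000x for one gate alone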