Keep in mind that you only have a small number of L1 data cache lines, and they fill up fast. If you can fit a couple more struct instances from an array into each cache line (without the line also holding unrelated data dragged in as a byproduct of the memory fetch), that is a huge win.
The number of L1 cache lines matters more than the raw size of the L1 cache. Memory is fetched a whole cache line at a time, so that granularity eats up the cache very quickly if every line is three-quarters full of data you don't need right now.
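To put rough numbers on that, here is a minimal C sketch. It assumes a 64-byte cache line (common, but check your target) and an array that starts on a cache line boundary. The struct names, the field layout and the two helper functions are made up for illustration; the "fat" struct is deliberately built so that 48 of its 64 bytes are cold data the loop never reads.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: 64-byte cache lines. Verify this for your actual CPU. */
#define CACHE_LINE_BYTES 64

/* Hypothetical "fat" struct: 16 hot bytes followed by 48 cold bytes. */
typedef struct {
    float   x, y, z, w;   /* hot: read on every iteration */
    uint8_t cold[48];     /* cold: rarely touched, but fetched anyway */
} FatParticle;

/* Hypothetical hot-only struct: the cold part lives in a separate array. */
typedef struct {
    float x, y, z, w;
} HotParticle;

/* Fail the build if the layout assumptions do not hold on this platform. */
static_assert(sizeof(FatParticle) == 64, "FatParticle should fill one line");
static_assert(sizeof(HotParticle) == 16, "HotParticle should be 16 bytes");

/* How many whole instances fit into one cache line. */
static size_t instances_per_line(size_t struct_size)
{
    return CACHE_LINE_BYTES / struct_size;
}

/* Cache lines touched when walking `count` contiguous instances,
 * assuming the array begins on a cache line boundary. */
static size_t lines_touched(size_t count, size_t struct_size)
{
    return (count * struct_size + CACHE_LINE_BYTES - 1) / CACHE_LINE_BYTES;
}
```

Under those assumptions, walking 1000 `FatParticle` instances touches 1000 cache lines, while walking 1000 `HotParticle` instances touches 250. Same hot data, a quarter of the lines, which is exactly the "three-quarters nonsense" case from above.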