And this opens up the question of how easy it is to re-target existing hardware to new techniques.
For example, if I understand the Chroma-from-Luma prediction correctly, then the maths involved is just simple linear regression (once for U vs. L, once for V vs. L). That's a pretty generic task. Even with the domain specialisation that it is done over pixels in an encoding block, we are still talking about concepts common to all codecs.
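To make the "generic task" point concrete, here is a minimal sketch of that per-block regression in Python with NumPy. It is not AV1's actual CfL (which works on subsampled, mean-removed luma and signals the scaling in the bitstream); the function names and the intercept term are my own simplification, but the core maths is the same closed-form least-squares fit.

```python
import numpy as np

def fit_chroma_from_luma(luma_block, chroma_block):
    """Least-squares fit: chroma ~= alpha * luma + beta over one block.

    Hypothetical helper, not the AV1 CfL algorithm itself; it just
    shows that the underlying maths is plain simple linear regression,
    run once per chroma plane (U vs. L, then V vs. L).
    """
    L = luma_block.astype(np.float64).ravel()
    C = chroma_block.astype(np.float64).ravel()
    n = L.size
    sL, sC = L.sum(), C.sum()
    # Closed-form simple linear regression coefficients.
    alpha = (n * (L * C).sum() - sL * sC) / (n * (L * L).sum() - sL * sL)
    beta = (sC - alpha * sL) / n
    return alpha, beta

def predict_chroma(luma_block, alpha, beta):
    """Predict a chroma block from the reconstructed luma block."""
    return alpha * luma_block + beta
```

The whole thing is a couple of dot products and sums per block, which is exactly the kind of primitive a generic accelerator should be able to run, if it is exposed.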
So maybe existing acceleration hardware can already do it. But even if it can, the required primitive needs to be exposed to software if new codecs are to benefit from it. So my question (and maybe Boxxed's too) is whether the hardware interface is low-level enough for such adaptability.