Everyone here is asking about hardware encoding/decoding, which is obviously important.
My question is, at what level do these encoders work? Are they basically specialized SIMD instructions, or are they fully featured chips that take raw data as input and produce byte streams in the format of the protocol? Or somewhere in between?
Typically it's a dedicated hardware block on the die of a larger chip, like AMD's UVD [1] and VCE [2] inside their GPUs/APUs, or Nvidia's PureVideo [3] and NVENC [4]. Sometimes the line is even blurrier, like the Broadcom SoC in the Raspberry Pi, where various more general-purpose processors can work together to decode [5].
Then you have APIs that try to break video codec processing into its typical stages, which applications can call and GPU drivers can implement; these serve as a bridge between hardware-assisted decode and application code. These are APIs like DXVA, or one of the several in use on Linux [6].
In between, but more like the latter conceptually. It’ll be a collection of bulk data processing blocks that are glued together by software. Generally, you want to implement the large bulk operations in HW, with SW handling all the option parsing and control logic.
And this opens up the question of how easy it is to re-target existing hardware to new techniques.
For example, if I understand Chroma-from-Luma prediction correctly, the maths involved is just 2-D linear regression (once for U vs. L, once for V vs. L). That's a pretty generic task. Even with the domain specialisation that it is done over the pixels of an encoding block, we are still talking about concepts common to all codecs.
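To make the point concrete, here's a minimal sketch of that regression (NumPy, with made-up block data — not any codec's actual CfL algorithm): fitting chroma from co-located luma over one block is just ordinary least squares.

```python
import numpy as np

# Hypothetical 8x8 block: reconstructed luma samples, and a chroma (U)
# plane faked to be linearly correlated with luma plus a little noise.
rng = np.random.default_rng(0)
luma = rng.integers(0, 256, size=(8, 8)).astype(float)
u = 0.5 * luma + 20.0 + rng.normal(0.0, 2.0, size=(8, 8))

# CfL-style prediction for this block: fit u ~= alpha * luma + beta
# by ordinary least squares, then predict chroma from luma.
alpha, beta = np.polyfit(luma.ravel(), u.ravel(), 1)
pred_u = alpha * luma + beta
```

The same fit would be repeated for V vs. L. The point isn't this particular code, it's that the bulk math is a generic dot-product/accumulate workload, exactly the kind of thing existing hardware blocks are good at.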
So maybe the existing acceleration hardware can already do it. But even if it can, the required primitive needs to be exposed to software if new protocols are to benefit from it. So my question (and maybe Boxxed's too) is whether the hardware interface is low-level enough for such adaptability.