Deep-learning-based compression techniques may one day be able to get speech dow...

ghoul2 · on April 13, 2022

I am aware of Lyra. It's pretty good. Very computationally expensive though - no audio application I have ever worked on had anything close to the power/thermal budget that would allow use of the deep-learning codecs. Maybe someday we will get very low-energy hardware accelerators for them, but until then, they are a non-starter for the things I work on.

The thing is (and maybe this is a nitpick), once you are down to several hundred bps for speech, it's getting to be more like speech-to-text (the encoder) and text-to-speech (the decoder) than an audio codec.
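
To put a rough number on that comparison, here is a back-of-the-envelope sketch. It assumes roughly 150 spoken words per minute, about 6 characters per word including the trailing space, and 8 bits per uncompressed character. All three figures are illustrative assumptions rather than measurements, and the function name is made up for this example.

```python
def text_bitrate_bps(words_per_minute: float,
                     chars_per_word: float,
                     bits_per_char: float) -> float:
    """Bitrate of a plain-text transcript of speech, in bits per second."""
    # Characters spoken per second at the given speaking rate.
    chars_per_second = words_per_minute * chars_per_word / 60.0
    return chars_per_second * bits_per_char


# Assumed values: 150 wpm, 6 chars/word (incl. space), 8 bits/char uncompressed.
rate = text_bitrate_bps(150, 6, 8)
print(f"{rate:.0f} bps")
```

With these assumed figures the raw transcript comes to 120 bps, before any text compression, so a speech codec at a few hundred bps is within a small factor of just sending the words.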

I am actually not aware of any non-speech audio codecs which can go that low. Any links?

vletal · on April 13, 2022

When we have enough compute or the models get much smaller. At this point it seems wasteful to use a gaming-level GPU to decode an audio stream.