I'll be brutally honest: not cool. It's old and boring and not in the least difficult. The software is what would've taken a while to make, and even that would theoretically consist of little more than two for loops that scan through each frame of video and output it to the audio jack.
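For what it's worth, the "two for loops" idea could look roughly like the sketch below. This is only a guess at the general approach, not the actual software behind the video: it assumes each frame is a small grid of on/off pixels, and `frame_to_xy` is a made-up name. Each lit pixel becomes one (x, y) pair in the range -1.0 to 1.0, which would then go out as one left/right sample pair.

```python
def frame_to_xy(frame):
    """Turn a 2D grid of 0/1 pixels into (x, y) points in [-1.0, 1.0].

    Assumption: frame is a list of equal-length rows, with row 0 at the top.
    """
    height = len(frame)
    width = len(frame[0]) if height else 0
    points = []
    for row in range(height):          # first loop: scan lines
        for col in range(width):       # second loop: pixels in the line
            if frame[row][col]:
                # Map column to x (left = -1, right = +1).
                x = 2.0 * col / max(width - 1, 1) - 1.0
                # Map row to y, flipped so the top row is +1.
                y = 1.0 - 2.0 * row / max(height - 1, 1)
                points.append((x, y))
    return points
```

A real version would also have to decide how long to dwell on each point and how to order them so the beam doesn't draw stray lines between distant pixels, which this sketch ignores.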
Why? The oscilloscope has X and Y inputs that position the spot, and they're driven by the sound card's output. I'm curious to know what about the explanation you don't find believable.
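To make the X/Y point concrete, here is a minimal sketch that writes a stereo WAV file which should trace a circle on a scope in X-Y mode. It assumes the left channel is wired to X and the right channel to Y; the function names, the 48 kHz rate, and the amplitude are all arbitrary choices for illustration, not anything taken from the video.

```python
import math
import struct
import wave

AMPLITUDE = 29490  # roughly 90% of the 16-bit maximum, to leave headroom


def circle_samples(num_samples, freq_hz, rate):
    """Return (left, right) 16-bit sample pairs that trace a circle.

    Left = cosine (X deflection), right = sine (Y deflection).
    """
    samples = []
    for n in range(num_samples):
        theta = 2.0 * math.pi * freq_hz * n / rate
        left = round(AMPLITUDE * math.cos(theta))
        right = round(AMPLITUDE * math.sin(theta))
        samples.append((left, right))
    return samples


def pack_stereo(samples):
    """Pack sample pairs as interleaved little-endian 16-bit PCM."""
    return b"".join(struct.pack("<hh", left, right) for left, right in samples)


def write_stereo_wav(path, samples, rate):
    """Write the sample pairs to a 2-channel, 16-bit WAV file."""
    with wave.open(path, "wb") as wav:
        wav.setnchannels(2)
        wav.setsampwidth(2)  # 2 bytes = 16 bits per sample
        wav.setframerate(rate)
        wav.writeframes(pack_stereo(samples))
```

Calling `write_stereo_wav("circle.wav", circle_samples(48000, 100.0, 48000), 48000)` would give one second of audio. Played through a sound card into the scope's two inputs, the spot should go around the circle 100 times per second, which is fast enough to look like a steady shape.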
The explanation leads me to believe they're trying to say that the music you're listening to was fed into the oscilloscope, and that produced the pretty images and text, which is clearly not what happened.
Thinking about it some more, I suppose they wrote a program to manipulate the soundcard output, but I'm still interested in the specifics.
> The explanation under the video doesn't pass my bullshit-o-meter.
Download the FLAC file and try for yourself then :)