We AB tested 16-44.1 and 24-96 versions of some really good classical recordings...

deaddodo · on Oct 1, 2023

I love how you can pull out 100 studies and side by side comparisons of recording tools/listening devices much more precise than the human ear that all show this as being flim-flam; and still "audiophiles" will convince themselves to spend 5-25k on specialty equipment that has no effect on their experience.

You're better off spending your money on a bog standard DAC/AMP combo (feel free to opt for tube even, if you insist) running through a pair of decent headphones off of a 320kbps MP3/AAC (or FLAC, if you insist) source. Even if we took your subjective insistence that this specialty equipment improved your experience by .00001%, it's probably not worth the 500-1500% increase in expense.

As to your specific example, I can guarantee you that your Bluetooth codec (LDAC or not) introduced far more sound artifacts than the difference between 16 and 24-bit sound.

KeplerBoy · on Oct 2, 2023

Placebo is one hell of a drug and has an effect much larger than .00001%. It might not pass an ABX test, but it absolutely does sound better to those who want it to sound better.

zigzag312 · on Oct 1, 2023

I would be interested in those studies if you can link a few of them.

Mistletoe · on Oct 1, 2023

Was your test blinded? I guess there is a chance you are an outlier but blind tests like this one don't support what you are saying.

http://archimago.blogspot.com/2014/06/24-bit-vs-16-bit-audio...

>In a naturalistic survey of 140 respondents using high quality musical samples sourced from high-resolution 24/96 digital audio collected over 2 months, there was no evidence that 24-bit audio could be appreciably differentiated from the same music dithered down to 16-bits using a basic algorithm (Adobe Audition 3, flat triangular dither, 0.5 bits).

>Furthermore, analysis of those utilizing more expensive audio systems ($6,000+) did not show any evidence of the respondents being able to identify the 24-bit audio. Those using headphones likewise did not show any stronger preference for the higher bit-depth sample. No difference was noted in the "older" (51+ years) age group data (not surprising if there is no discernible difference even with potential age-related hearing acuity changes).

amlib · on Oct 1, 2023

The 24-96 is a different master, some sound engineer just had a field day in the studio and produced a better mix. Repeat the test with a 16-44.1 version downsampled (use something like sox with the ultra high quality resampler) from the 24-96 version and I guarantee you will not be able to spot any difference compared to the "true" 24-96 version.
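
A minimal sketch of the down-conversion described above, assuming SoX is installed and on the PATH. The file names are hypothetical. In SoX, `rate -v` selects the very-high-quality resampler, `-b 16` sets the output bit depth, and `dither` adds dither for the bit-depth reduction. The helper only builds the command, so it can be inspected before anything is run.

```python
import subprocess

def sox_downsample_cmd(src, dst, rate_hz=44100, bits=16):
    """Build a SoX command that resamples src and writes dst at the given bit depth."""
    return [
        "sox", src,
        "-b", str(bits), dst,        # output format options precede the output file
        "rate", "-v", str(rate_hz),  # "-v": very high quality resampling
        "dither",                    # dither the reduction to the lower bit depth
    ]

# Hypothetical file names; replace with your own 24/96 source.
cmd = sox_downsample_cmd("master_24_96.flac", "derived_16_44.flac")
print(" ".join(cmd))
# subprocess.run(cmd, check=True)  # uncomment to actually run SoX
```

Comparing the 24/96 file against a file derived from it this way keeps the master constant, so the only variable left is the format.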

saltcured · on Oct 2, 2023

I understand the theory of this, but cannot reconcile it with an experience I had in person a decade ago.

In the acoustically prepped monitoring booth of his recording studio, a friend of mine tried to give me an ABX test of 24bit 96 kHz recording and its 16bit 44.1 kHz rendering that was supposedly done right. I heard the difference and easily picked the high-rate one that sounded more life-like. With my best effort, I described it as having a clearer high frequency spectrum, while the other sounded muffled in comparison.

I am left wondering if the 44.1 kHz file wasn't actually rendered correctly with dithering, or if my friend failed to actually get his studio equipment to play it back correctly. I.e. was some overly aggressive low-pass filter done during the conversion or during the playback.

amlib · on Oct 2, 2023

As you said yourself, it was most likely a rendering issue. A bad low-pass filter would have attenuated the high-end when converting to 44khz. Also, this is afaik the reason why all modern audio uses 48khz, you get a little bit more head-room when designing a low pass filter and you can even choose a less aggressive and perhaps less computationally expensive one that still won't have an effect on the humanly perceptible frequencies.
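
The "head-room" point can be put in numbers. This is a rough sketch that assumes 20 kHz as the top of the audible band (an assumption, not a figure from the thread): the anti-aliasing filter has to pass everything below that edge and be fully attenuating by the Nyquist frequency, which is half the sample rate.

```python
def transition_band_hz(sample_rate_hz, passband_edge_hz=20_000):
    """Width between the top of the audible band and the Nyquist frequency."""
    nyquist_hz = sample_rate_hz / 2
    return nyquist_hz - passband_edge_hz

for rate in (44_100, 48_000, 96_000):
    print(f"{rate} Hz: {transition_band_hz(rate):.0f} Hz of room for the filter roll-off")
```

Under that assumption, 48 kHz leaves roughly twice the transition band that 44.1 kHz does, which is what allows a gentler filter.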

publicmail · on Oct 2, 2023

I think the reason a lot of modern audio is at 48k has more to do with it being accompanied by video, which has independently settled on sampling rates of 48k, 96k, etc.

eredengrin · on Oct 1, 2023

How do you know that the 24/96 and 16/44 came from the same masters? If this isn't controlled for then of course the result might be different.[0]

Also, what is xm1000w3s? I can't find any record of this so I'm guessing maybe it is referring to the WH1000XM3 headphones? Given ldac is also mentioned this seems a reasonable guess as it's a bluetooth model. If that's the case I wouldn't call it "good listening equipment", the default frequency response curve of the wh1000xm3 is incredibly bad, it's barely worth listening to classical music on without using AutoEq[1] or something equivalent (I have a pair and it's much worse than my old Ath M50s which were like half the price). The bass heavy curve of the headphones is far more noticeable than any difference between 16/24 bit audio would ever make.

[0] https://people.xiph.org/~xiphmont/demo/neil-young.html#toc_d...

[1] https://autoeq.app/

eviks · on Oct 1, 2023

I'd rather trust solid hearing biology/physics plus all the other failed tests

> the effective dynamic range of 16 bit audio reaches 120dB in practice [13], more than fifteen times deeper than the 96dB claim.

> 120dB is greater than the difference between a deserted 'soundproof' room and a sound loud enough to cause hearing damage in seconds.

> 16 bits is enough to store all we can hear, and will be enough forever.

https://people.xiph.org/~xiphmont/demo/neil-young.html#toc_1...

user_7832 · on Oct 1, 2023

>> the effective dynamic range of 16 bit audio reaches 120dB in practice [13], more than fifteen times deeper than the 96dB claim.

> 120dB is greater than the difference between a deserted 'soundproof' room and a sound loud enough to cause hearing damage in seconds.

> 16 bits is enough to store all we can hear, and will be enough forever.

Correct me if I'm wrong, but isn't 16 bit = 120db about the levels of gradations of sound? Even a 4 bit = 16 levels of sound pressure/SPL could go from 20db, 20+12.5=32.5db, 32.5+12.5db and so on until 120db.

Then, the important question is what's the minimum SPL difference perceptible (at a given SPL level). That may well not be 1db.

rplst8 · on Oct 1, 2023

That's not how it works. Each bit of sample size yields about 6db of SNR. If you amplify a source to 120db SPL that was recorded with 4-bit samples the quantization noise would be about 96db SPL.
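
The arithmetic behind that figure, as a small sketch. It uses the plain ratio of full scale to one quantization step, 20·log10(2^bits), and ignores dither and noise shaping, so treat the results as approximate. The function names are made up for illustration.

```python
import math

def quantization_snr_db(bits):
    """Ratio of full scale to one quantization step, in dB (about 6.02 dB per bit)."""
    return 20 * math.log10(2 ** bits)

def noise_floor_db_spl(peak_db_spl, bits):
    """Approximate level of the quantization noise when full scale plays at peak_db_spl."""
    return peak_db_spl - quantization_snr_db(bits)

for bits in (4, 16, 24):
    snr = quantization_snr_db(bits)
    floor = noise_floor_db_spl(120, bits)
    print(f"{bits:2d} bits: SNR {snr:6.1f} dB, noise floor {floor:6.1f} dB SPL at a 120 dB SPL peak")
```

With 4-bit samples the noise floor sits only about 24 dB under the peak, which is where the roughly 96 dB SPL figure comes from.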

danbee · on Oct 2, 2023

The decibel is a logarithmic (and relative) unit, not a linear one. So each bit represents 6db, or a doubling of amplitude.
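
One way to check the 6 dB per bit rule empirically is to quantize a test tone at several bit depths and measure the error. The sketch below uses a plain rounding quantizer with no dither and an arbitrary sine frequency; both are choices made for illustration.

```python
import math

def measured_snr_db(bits, n_samples=10_000):
    """Quantize a full-scale sine to `bits` bits and measure signal-to-error power in dB."""
    step = 2.0 / (2 ** bits)  # quantization step when [-1, 1] is split into 2**bits levels
    signal_power = 0.0
    error_power = 0.0
    for i in range(n_samples):
        # An irrational frequency ratio avoids the sine repeating the same few samples.
        x = math.sin(2 * math.pi * i * (math.sqrt(2) / 16))
        q = round(x / step) * step   # uniform rounding quantizer, no dither
        signal_power += x * x
        error_power += (x - q) ** 2
    return 10 * math.log10(signal_power / error_power)

previous = None
for bits in range(4, 17, 4):
    snr = measured_snr_db(bits)
    gain = "" if previous is None else f" (+{(snr - previous) / 4:.2f} dB per extra bit)"
    print(f"{bits:2d} bits: {snr:6.2f} dB{gain}")
    previous = snr
```

The measured values should land near the textbook 6.02·N + 1.76 dB for a full-scale sine, with the per-bit increment close to 6 dB.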

jbverschoor · on Oct 1, 2023

We also can't see more than 60 fps according to so much research. And why would we want 10 bit screens?

I checked out the link, and the Sample 2 file does not represent any wave and is not audible, so the article contradicts itself.

mrob · on Oct 1, 2023

What exactly do you mean by "see more than 60fps"? It's possible that 60fps video with full temporal antialiasing and low to moderate motion speed could fool untrained viewers, but if I'm allowed to move my eyes I can tell the difference between high frame rate video (simulated with strobing LEDs because of lack of suitable video hardware) and real-life motion well into the thousands of frames per second. This isn't an unusual ability:

https://journals.sagepub.com/doi/10.1177/1477153512436367

Note that 2kHz flicker requires 4000fps to be displayed as video.
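
The 4000fps figure follows from needing at least two frames per flicker cycle. A tiny sketch, with a made-up function name:

```python
def min_fps_for_flicker(flicker_hz):
    """Each flicker cycle needs at least one 'on' frame and one 'off' frame."""
    return 2 * flicker_hz

print(min_fps_for_flicker(2000))  # 2 kHz flicker -> 4000
```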

deaddodo · on Oct 1, 2023

I think people are also comparing apples to oranges here. Vision is analog. There is no "DPI" or "FPS" that human vision can see. Some types of motion the human eye can perceive at thousands of "frames" and others it can only perceive at 60; in some colors (green) and contrasts it can distinguish extremely fine detail, and in others (blue) it cannot. Ultimately it's variable and non-digital, so it's never going to map onto strict terms like these.

The audio that reaches your ears, on the other hand, is an analog signal, even if it was digital somewhere in between. There are no resolution arguments to be made here; all that matters is that the output device can accurately reproduce the proper analog signal. It has been shown time and time again that it can, and that any simplification of said signal is imperceptible to anything but the most finely tuned listening devices (or maybe some special "golden ears" that the vast majority of audiophiles don't belong to).

ReactiveJelly · on Oct 1, 2023

We would want 10 bit screens because the research indicates that the dynamic range of human vision is around 90 dB or 1:1,000,000,000, which is alarmingly higher than even 1:1,024

https://en.wikipedia.org/wiki/Dynamic_range#Human_perception

If all research is wrong, I'm gonna start drinking vinegar and building perpetual motion machines :P
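
For reference, the 90 dB and 1:1,000,000,000 figures in this comment are the same number under the 10·log10 convention used for power-like ratios. The sketch below applies that conversion to both ratios. Note the simplification: display code values normally pass through a nonlinear transfer function, so 1,024 code values do not literally mean a 1:1,024 contrast ratio.

```python
import math

def contrast_ratio_to_db(ratio):
    """Express a luminance (power-like) ratio in decibels: 10 * log10(ratio)."""
    return 10 * math.log10(ratio)

print(contrast_ratio_to_db(1_000_000_000))  # the 1:1,000,000,000 figure
print(contrast_ratio_to_db(1024))           # 1:1,024, the number of 10-bit code values
```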

jbverschoor · on Oct 1, 2023

According to Pantone, "Researchers estimate that most humans can see around one million different colours". So research says we only need 7 bits.
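
The 7-bit figure can be reproduced as below. The sketch assumes three channels and that the million colours are spread evenly over the RGB code space, which is a strong simplification rather than a claim about how colour perception works.

```python
import math

def bits_per_channel(distinct_colours, channels=3):
    """Smallest whole number of bits per channel that can index that many colours."""
    total_bits = math.log2(distinct_colours)
    return math.ceil(total_bits / channels)

print(math.log2(1_000_000))          # about 19.93 bits in total
print(bits_per_channel(1_000_000))   # 7
```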

"Research".. sponsored by corporations, and peer-checked by scientific voting rings. A bunch of incrowd elitists who like to use jargon. Science and politics these days are pretty similar

crthpl · on Oct 1, 2023

The one million is probably how many different hues we can see. We can see many more different brightness levels.

jbverschoor · on Oct 2, 2023

The article that mentioned it already includes all the different components, lightness etc

Eisenstein · on Oct 1, 2023

Are you both talking about the same thing? Is dynamic range the same thing as 'number of colors'?

Eisenstein · on Oct 2, 2023

I think a lot of your objections to 'science' are due to basic communication misunderstandings and taking things you heard second hand at face value as 'science'. It would probably help to decouple yourself from the notion that a pared down snippet heard from a journalist or on a website is actually what the studies are saying.

chadaustin · on Oct 1, 2023

Where does this “can’t see more than 60 fps” rumor come from?

It’s trivially refutable by placing a 60 Hz strobe (e.g. old fluorescent light or even some aftermarket headlights) at the corner of your vision.

Also, for interactive systems, 16 ms is a large chunk of our reaction time. You need close to 1 ms response times (1000 fps) to approximate pen and paper.

jbverschoor · on Oct 1, 2023

I don’t know where it came from.. it was already there in the CRT times.

A simple google on 60 fps will still show these “scientists” who claim that we can't perceive anything higher than 30-60 fps.

“Science” does NOT equal truth.

Eisenstein · on Oct 1, 2023

You seem to be the only one claiming this bit of 'science'. No one else has heard of this claim.

manderley · on Oct 2, 2023

With CRT monitors, different refresh rates were really easy to spot - 60 Hz was very flickery.

wtallis · on Oct 2, 2023

Yeah, 60Hz on a CRT was more or less the minimum tolerable refresh rate, and 75-85Hz was noticeably better. And that's just for trying to display a static image without distracting flickering. Displaying smooth motion is a lot harder.

eviks · on Oct 2, 2023

Try to do better than a simple google, maybe you'll actually stumble on real science which would help understand the difference between the linked claims about hearing and yours about vision

Lapha · on Oct 1, 2023

The topic of human vision and perception is complex enough that I very much doubt it's scientists who are making the claim that we can't perceive anything higher than 30-60fps. There's various other effects like fluidity of motion, the flicker fusion threshold, persistence of vision, and strobing effects (think phantom array/wagon wheel effects), etc, which all have different answers. For example, the flicker fusion threshold can be as high as 500hz[0], similarly strobing effects like dragging your mouse across the screen are still perceivable on 144hz+ and supposedly 480hz monitors.

As far as perceiving images goes, there's a study at [1] which shows people can reliably identify images shown on screen for 13ms (75hz, the refresh rate of the monitor they were using). That is, subjects were shown a sequence of 6-12 distinct images 13ms apart and were still able to reliably identify a particular image in that sequence. What's noteworthy is that this study is commonly cited for the claim that humans can only perceive 30-60fps, despite addressing a completely separate issue from perception of framerates. It is also a massive improvement over previous studies, which show figures as high as 80-100ms; that seems like a believable figure if they were using a similar or worse methodology. I can easily see this and similar studies being the source of the claims that people can only process 10-13 images a second, or perceive 30-60 fps, if science 'journalists' are lazily plugging something like 1000/80 into a calculator without having read the study.

There's also the old claim [2] from at least 2001 that the USAF studied fighter pilots and found that they can identify planes shown on screen for 4.5ms, 1/220th of a second, 1/225th of a second, or various other figures, but I can't find the source for this and I'm sure it's more of an urban legend that circulated gaming forums in the early 2000s than anything. If it was an actual study I'm almost certain persistence of vision played a role in this, something the study at [1] avoids entirely.

[0] 'Humans perceive flicker artifacts at 500 Hz' https://pubmed.ncbi.nlm.nih.gov/25644611/

[1] 'Detecting meaning in RSVP at 13 ms per picture' https://dspace.mit.edu/bitstream/handle/1721.1/107157/13414_...

[2] http://amo.net/nt/02-21-01fps.html
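
The conversions behind the numbers in this comment are simple reciprocals. A short sketch with made-up function names, showing where 13 ms at 75 Hz and the "10-13 images a second" range come from:

```python
def images_per_second(exposure_ms):
    """How many back-to-back images fit in one second at exposure_ms each."""
    return 1000 / exposure_ms

def frame_time_ms(refresh_hz):
    """Milliseconds per frame at the given refresh rate."""
    return 1000 / refresh_hz

print(round(frame_time_ms(75), 1))                    # one frame on a 75 Hz monitor, in ms
print(images_per_second(80), images_per_second(100))  # the older 80-100 ms figures
print(round(frame_time_ms(60), 1))                    # one frame at 60 fps, in ms
```

These are presentation rates from the studies, not a limit on what framerate a viewer can perceive, which is the distinction the comment is drawing.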

nyolfen · on Oct 1, 2023

"you need more than anecdotal evidence"

"have some anecdotal evidence"

replete · on Oct 2, 2023

Apologies, that flippant comment was super rude and juvenile and I am taking it as a signal that I really need a break, I don't normally speak in this way. Yikes.

Also totally fair, I shouldn't be hastily writing comments but I am interested in audio and wanted to share my experiences, naively.

I have some 'hires' recordings that I can't tell apart from a CD at all and do not care for, and some where I hear more details (on the very high end) and more separation between instruments - and from what I'm reading it seems more like this is a mastering issue. The difference on some of these recordings enables a kind of subjective 'holographic' spatial effect in me (perhaps the cause of my emphatic response) and it seems I have probably falsely attributed this to the higher resolution.

nyolfen · on Oct 10, 2023

i appreciate your apology and i didn't mean it as an attack on you personally, it was just meant to highlight the irony. perhaps i could have phrased it more sensitively

jbverschoor · on Oct 1, 2023

These days "good equipment" unfortunately means:

- Sonos

- Airpods

- Beats

pimeys · on Oct 1, 2023

All with Bluetooth compression...

For that price range, Hifiman produces pretty good planar headphones. The edition XS sounds really good.

fsckboy · on Oct 1, 2023

things that are slightly louder "sound better". How did you control this sort of thing?

eviks · on Oct 1, 2023

Sure, but that's also easy to normalize in a proper test
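
A rough sketch of what level matching could look like, using RMS as a simple stand-in for loudness. Real listening tests may prefer a perceptual loudness measure, and the two clips here are synthetic placeholders, not data from any test.

```python
import math

def rms(samples):
    """Root-mean-square level of a sequence of float samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))

def match_level(reference, candidate):
    """Scale candidate so its RMS level equals that of reference."""
    gain = rms(reference) / rms(candidate)
    return [s * gain for s in candidate]

# Synthetic stand-ins for two versions of the same clip, one of them 1 dB louder.
quiet = [math.sin(2 * math.pi * 440 * n / 44_100) * 0.5 for n in range(4410)]
loud = [s * 10 ** (1 / 20) for s in quiet]
matched = match_level(quiet, loud)

print(20 * math.log10(rms(loud) / rms(quiet)))     # level difference before matching, in dB
print(20 * math.log10(rms(matched) / rms(quiet)))  # level difference after matching, in dB
```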

ReactiveJelly · on Oct 1, 2023

Why AB and not ABX?

wuiheerfoj · on Oct 1, 2023

Because the base rate is 50% in an either/or test

squeaky-clean · on Oct 2, 2023

But you don't necessarily know they're guessing what you're testing for in the A/B test. If they are answering which one sounds better, some songs will genuinely sound better with a little lossy compression. Did you check to see if any audio samples deviated from the base 50% rate in either direction? For example if 70% of people chose the compressed version of Audio Sample 15, that still demonstrates an ability to discern a difference. It just turns out they like the lossy sample more.

For a contrived example, imagine an A/B test where you have to tell me which panel has more red. The image has a dark red panel on the left and a fully bright white panel on the right. Most people would say the left is more red, but in my fictional test it is actually the white panel because (100, 0, 0) has less red than (255, 255, 255).

If you use ABX, people know exactly what they are supposed to be matching.
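
The "deviated from the base 50% rate in either direction" check can be sketched as a two-sided binomial test. The listener counts below are hypothetical, chosen only to mirror the 70% example; the test asks how likely a split at least that lopsided would be if everyone were guessing.

```python
from math import comb

def two_sided_p_value(successes, trials):
    """Probability of a split at least this lopsided if every listener is guessing (p = 0.5)."""
    extreme = max(successes, trials - successes)
    tail = sum(comb(trials, k) for k in range(extreme, trials + 1)) / 2 ** trials
    return min(1.0, 2 * tail)

print(two_sided_p_value(70, 100))  # 70 of 100 pick the compressed sample
print(two_sided_p_value(30, 100))  # 30 of 100: same evidence of a difference, opposite preference
print(two_sided_p_value(52, 100))  # close to the 50% base rate
```

A small p-value in either direction suggests listeners can hear a difference, whichever version they happen to prefer.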