This has nothing to do with dynamic range. Dithering is required in order to have *consistent* quantization noise (a flat white noise floor) and to be able to encode the content with zero distortion. Dithering per se is not a perceptual hack.
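A minimal sketch of that claim, assuming NumPy and TPDF dither of ±1 LSB peak (a common choice, not something specified in this thread). The helper names `quantize` and `tone_amplitude` are made up for illustration. A 1kHz tone at about -100dBFS is smaller than half a quantization step, so plain rounding erases it entirely, while dithered rounding keeps it recoverable underneath the noise:

```python
import numpy as np

def quantize(x, rng=None):
    """Round to whole LSB steps; if rng is given, add TPDF dither first."""
    if rng is not None:
        # Sum of two uniform variables gives triangular (TPDF) dither, +/-1 LSB peak.
        x = x + rng.uniform(-0.5, 0.5, x.shape) + rng.uniform(-0.5, 0.5, x.shape)
    return np.round(x)

fs = 44100
t = np.arange(fs) / fs                      # one second of samples
amp = 32768 * 10 ** (-100 / 20)             # about 0.33 LSB, i.e. -100 dBFS
reference = np.sin(2 * np.pi * 1000 * t)
tone = amp * reference

plain = quantize(tone)                                  # no dither
dithered = quantize(tone, np.random.default_rng(0))     # TPDF dither

def tone_amplitude(y):
    """Estimate the 1 kHz amplitude (in LSB) by correlating with the reference."""
    return 2 * np.mean(y * reference)

print("undithered samples all zero:", bool(np.all(plain == 0)))
print("amplitude recovered from dithered signal (LSB):", tone_amplitude(dithered))
```

Without dither the quantization error is a function of the signal (here it cancels the signal completely); with dither it turns into signal-independent noise, which is the "consistent noise floor" being described.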
We perceive audio in the frequency domain, not the time domain. This means that even though, say, a 1kHz tone at -100dB is *below* the broadband noise floor of -96dB, it is *above* the noise floor in a narrow band around its frequency.
That means we can perceive it as above the noise floor. Since music isn't a flat broadband noise source but strongly tonal, this means that the effective dynamic range of 16-bit PCM is *higher* than 96dB, because we can encode the information content above the noise floor.
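To make this concrete, here is a rough simulation, assuming NumPy, TPDF dither, and a 65536-point FFT with the tone placed exactly on a bin (all illustrative choices, not taken from the thread). It compares the broadband noise level with the level in individual FFT bins:

```python
import numpy as np

rng = np.random.default_rng(1)
fs, n = 44100, 65536
k = 1486                                   # bin index; k * fs / n is roughly 1000 Hz
full_scale = 32768.0
idx = np.arange(n)

# Tone with a peak amplitude of -100 dBFS, centred on bin k so no window is needed.
tone = full_scale * 10 ** (-100 / 20) * np.sin(2 * np.pi * k * idx / n)
tpdf = rng.uniform(-0.5, 0.5, n) + rng.uniform(-0.5, 0.5, n)
pcm = np.round(tone + tpdf)                # dithered 16-bit sample values

# Broadband view: RMS of everything that is not the tone, relative to full scale.
broadband_db = 20 * np.log10(np.std(pcm - tone) / full_scale)

# Narrowband view: per-bin amplitude, scaled so a sine reads its peak amplitude.
spectrum = np.abs(np.fft.rfft(pcm)) * 2 / n
db = 20 * np.log10(np.maximum(spectrum, 1e-12) / full_scale)
peak_bin = int(np.argmax(db[1:])) + 1      # skip the DC bin
median_floor = float(np.median(db[1:]))

print("broadband noise (dBFS):", broadband_db)
print("tone bin level (dBFS):", db[k])
print("median per-bin noise (dBFS):", median_floor)
```

The tone sits a few dB under the broadband figure but tens of dB above the noise in its own bin. How far above depends on the FFT length chosen here, which is not a model of the ear.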
This is all *without* shaped dither. Shaped dither *further* increases dynamic range by taking advantage of the frequency-dependent response of the human ear, to lower the noise floor in the bands where it most matters.
This is all graphically easy to see on a spectrum analyzer, which is a good visual tool to look at audio the way our ears perceive it. That's why the noise spectrum of dithered silence at 44.1kHz/16 is not -96dB, but significantly lower (depending on FFT parameters).
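The "depending on FFT parameters" part can be sketched with a back-of-the-envelope formula. This assumes ideal white noise, a one-sided spectrum, and no window function (real analyzers add a window-dependent correction that is ignored here); `per_bin_floor_db` is a hypothetical helper name:

```python
import math

def per_bin_floor_db(total_noise_db, fft_size):
    """Average per-bin level when white noise of total power total_noise_db
    is spread evenly over the fft_size / 2 bins of a one-sided spectrum."""
    return total_noise_db - 10 * math.log10(fft_size / 2)

for size in (1024, 4096, 65536):
    print(size, round(per_bin_floor_db(-96.0, size), 1))
```

Each doubling of the FFT size lowers the displayed floor by about 3dB, which is why the plotted noise floor of dithered silence lands well under -96dB.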
@xiphmont has a footnote in the article I linked that notes how with an infinite window size, the dynamic range is effectively infinite; of course our ears don't have an effectively infinite window size. We can come up with a representative approximation for perceptual purposes.
Ultimately the real point is: the dynamic range of our ears is defined as the difference between the loudest (ear-damage level) sound and the quietest sound *we can perceive*. Since *we can perceive* a sound encoded at <-96dB in 16bit PCM, that is *not* the dynamic range.
Remember the absolute threshold of hearing is defined for a *pure tone*, not broadband noise! It would be way higher for broadband white noise.
I just did a little test, and I can perceive a pure tone at 800Hz at about -30dB relative to white noise. That means the dynamic range of a 16-bit PCM signal should be somewhere around 96+30 = 126dB. Shaped dither would improve this further.
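The arithmetic, as a tiny sketch. The 96dB figure is taken here as 20·log10(2^16), and the 30dB margin is the informal listening result above, not a measured constant; `pcm_dynamic_range_db` is a hypothetical name:

```python
import math

def pcm_dynamic_range_db(bits, tone_margin_db=0.0):
    """Nominal range of linear PCM (20*log10(2**bits)) plus an assumed margin
    for how far below broadband noise a pure tone can still be heard."""
    return 20 * math.log10(2 ** bits) + tone_margin_db

print(pcm_dynamic_range_db(16))        # nominal 16-bit figure
print(pcm_dynamic_range_db(16, 30))    # with the informal 30 dB tone margin
```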
Understood, thanks for the explanation. Just one small remark: ear response is logarithmic (in both amplitude and frequency), so instead of measuring the noise floor with an FFT, it would be more suitable to use something like 1/3-octave frequency bands...
That's fair enough. And of course that also makes perfect sense in the context of shaped dither, where you want to have more dither power at higher frequencies.
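A quick sketch of the 1/3-octave idea, assuming ideal white noise spread evenly from 0 to fs/2 and band edges at fc·2^(±1/6); the function name and the -96dB total are illustrative, not from the thread:

```python
import math

def third_octave_noise_db(total_noise_db, center_hz, fs=44100):
    """Level of white noise (total power total_noise_db over 0..fs/2) that
    falls inside the 1/3-octave band centred on center_hz."""
    lower = center_hz * 2 ** (-1 / 6)
    upper = center_hz * 2 ** (1 / 6)
    return total_noise_db + 10 * math.log10((upper - lower) / (fs / 2))

for fc in (250, 1000, 4000):
    print(fc, round(third_octave_noise_db(-96.0, fc), 1))
```

Because these bands get wider with frequency, flat white noise measures about 3dB higher per octave on this scale, which is a different picture from the flat line an FFT plot shows.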