Ah, but we aren't talking about SNR, we're talking about dynamic range. Not quite the same thing ;). Music isn't a broadband noise source, so you get some extra perceptual headroom for free.
Let's sum it up, to see if we agree: 16 bit without oversampling allows for about 96 dB SNR. Using dithering you can have more than 96 dB SNR by taking advantage of human hearing characteristics (an 'A' frequency weighting filter can be used for computations).
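For reference, a minimal sketch of where the "about 96 dB" figure comes from, assuming an ideal uniform quantizer. The function names are made up for illustration. The ratio of full scale to one LSB is 20*log10(2^N); measuring a full-scale sine against the quantization noise adds the usual 1.76 dB.

```python
import math

def step_ratio_db(bits: int) -> float:
    """Ratio of full scale to one quantization step (LSB), in dB."""
    return 20 * math.log10(2 ** bits)

def ideal_sine_snr_db(bits: int) -> float:
    """SNR of a full-scale sine through an ideal quantizer: ~6.02*bits + 1.76 dB."""
    return step_ratio_db(bits) + 10 * math.log10(1.5)

print(f"16-bit step ratio: {step_ratio_db(16):.2f} dB")
print(f"16-bit sine SNR:   {ideal_sine_snr_db(16):.2f} dB")
```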
Other than that, you can use oversampling to gain more SNR, but that requires increasing Fs beyond the Nyquist rate.
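A rough sketch of the oversampling gain, assuming the quantization noise is white and everything above the band of interest gets filtered out afterwards. The helper names are hypothetical. Under those assumptions the in-band noise drops by 10*log10(OSR), so every 4x of oversampling is worth roughly one extra bit.

```python
import math

def oversampling_gain_db(osr: float) -> float:
    """In-band SNR gain from oversampling by `osr`, assuming white quantization noise."""
    return 10 * math.log10(osr)

def extra_bits(osr: float) -> float:
    """The same gain expressed in equivalent bits (~6.02 dB per bit)."""
    return oversampling_gain_db(osr) / (20 * math.log10(2))

for osr in (2, 4, 16, 64):
    print(f"{osr:3d}x: +{oversampling_gain_db(osr):5.2f} dB, about {extra_bits(osr):.2f} bits")
```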
You have to be careful though: the lower Fs is, the more stringent the requirements on both the anti-aliasing and the reconstruction filters. Both must be analog and neither will be an ideal brickwall filter.
Fs only needs to be strictly above twice the highest frequency you want to encode. Oversampling is a practical requirement for ADCs and DACs as an *implementation detail*. The end result is still a black box that takes analog in and spits 48kHz out.
The fact that the *practical* way to make a near-ideal ADC or DAC is by resampling/filtering digitally first is irrelevant. That does not change the fact that 44.1kHz/48kHz digital audio is and always will be sufficient to encode all the information.
This has nothing to do with dynamic range. Dithering is required in order to have *consistent* quantization noise (a flat white noise floor) and to be able to encode the content with zero distortion. Dithering per se is not a perceptual hack.
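To make the "zero distortion" point concrete, here is a small sketch (not from the article, just an illustration) that quantizes a constant input of 0.3 LSB. The value 0.3 and the helper names are arbitrary assumptions. Without dither the quantizer outputs 0 forever and the signal is simply gone; with TPDF dither each output is noisy, but the average recovers the input.

```python
import math
import random

def quantize(x: float) -> int:
    """Round to the nearest integer LSB."""
    return math.floor(x + 0.5)

def tpdf(rng: random.Random) -> float:
    """Triangular-PDF dither spanning +/-1 LSB (difference of two uniforms)."""
    return rng.random() - rng.random()

def mean_output(x: float, n: int, dither: bool, seed: int = 0) -> float:
    """Average quantizer output for a constant input x (in LSBs)."""
    rng = random.Random(seed)
    total = 0
    for _ in range(n):
        total += quantize(x + (tpdf(rng) if dither else 0.0))
    return total / n

undithered = mean_output(0.3, 100_000, dither=False)
dithered = mean_output(0.3, 100_000, dither=True)
print(f"undithered mean: {undithered}")    # the 0.3 LSB input is lost
print(f"dithered mean:   {dithered:.3f}")  # close to the 0.3 LSB input
```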
Our perception of audio is in the frequency domain, not the time domain. This means that even though, say, a 1kHz tone at -100dB is *below* the broadband noise floor of -96dB, it is *above* the noise floor in a narrow band around its frequency.
That means we can perceive it as above the noise floor. Since music isn't a flat broadband noise source but strongly tonal, this means that the effective dynamic range of 16-bit PCM is *higher* than 96dB, because we can encode the information content above the noise floor.
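A sketch of that argument in code, under a few assumptions of mine: an 8192-sample window at 48kHz, a tone placed exactly on a DFT bin (about 1002 Hz), flat TPDF dither, and levels referenced to a full-scale sine. The tone is at -100 dBFS, below the broadband noise of the dithered quantizer (a bit above -96 dB here because the dither adds its own noise), yet its own bin sits far above the neighbouring noise bins.

```python
import math
import random

N = 8192            # analysis window in samples (assumed)
TONE_BIN = 171      # 171 * 48000 / 8192 is roughly 1002 Hz
FULL_SCALE = 32767  # peak of a full-scale 16-bit sine, in LSBs

def dithered_16bit(level_dbfs=None, seed=0):
    """Quantize a sine (or silence if level_dbfs is None) to integer LSBs with TPDF dither."""
    rng = random.Random(seed)
    amp = 0.0 if level_dbfs is None else FULL_SCALE * 10 ** (level_dbfs / 20)
    samples = []
    for n in range(N):
        x = amp * math.sin(2 * math.pi * TONE_BIN * n / N)
        dither = rng.random() - rng.random()
        samples.append(math.floor(x + dither + 0.5))
    return samples

def broadband_db(samples):
    """Total RMS level relative to a full-scale sine."""
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    return 20 * math.log10(rms / (FULL_SCALE / math.sqrt(2)))

def bin_db(samples, k):
    """Level of DFT bin k, scaled so a full-scale sine reads 0 dB."""
    re = im = 0.0
    for n, s in enumerate(samples):
        phase = 2 * math.pi * k * n / N
        re += s * math.cos(phase)
        im += s * math.sin(phase)
    return 20 * math.log10(2 * math.hypot(re, im) / (N * FULL_SCALE) + 1e-30)

tone = dithered_16bit(-100.0)
noise_db = broadband_db(dithered_16bit(None))
tone_db = bin_db(tone, TONE_BIN)
neighbour_db = max(bin_db(tone, k) for k in range(TONE_BIN + 5, TONE_BIN + 15))

print(f"broadband noise:        {noise_db:.1f} dB")
print(f"tone bin:               {tone_db:.1f} dB")
print(f"loudest nearby noise bin: {neighbour_db:.1f} dB")
```

A longer window pushes the per-bin noise down further while the tone bin stays put, which is the same effect the spectrum analyzer shows.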
This is all *without* shaped dither. Shaped dither *further* increases dynamic range by taking advantage of the frequency-dependent response of the human ear, to lower the noise floor in the bands where it most matters.
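A toy illustration of noise shaping, with a big caveat: real shaped dither uses filters matched to the ear's sensitivity curve, while this sketch swaps in a plain first-order error-feedback loop that just pushes noise towards high frequencies. All names and sizes are assumptions. The point it shows is only that shaping moves noise around: total noise power goes up, but the noise in the chosen (here: low) band drops a lot.

```python
import math
import random

N = 4096  # number of samples of quantized silence (assumed)

def quantized_silence(shaped: bool, seed: int = 1):
    """Quantize digital silence with TPDF dither, optionally with first-order error feedback."""
    rng = random.Random(seed)
    out, err_prev = [], 0.0
    for _ in range(N):
        v = -err_prev if shaped else 0.0       # subtract the previous quantization error
        dither = rng.random() - rng.random()   # TPDF dither, +/-1 LSB
        y = math.floor(v + dither + 0.5)
        err_prev = y - v                       # total error made on this sample
        out.append(y)
    return out

def band_power(samples, k_lo, k_hi):
    """Sum of squared DFT magnitudes over bins k_lo..k_hi (unnormalized)."""
    total = 0.0
    for k in range(k_lo, k_hi + 1):
        re = im = 0.0
        for n, s in enumerate(samples):
            phase = 2 * math.pi * k * n / N
            re += s * math.cos(phase)
            im += s * math.sin(phase)
        total += re * re + im * im
    return total

flat = quantized_silence(shaped=False)
shaped = quantized_silence(shaped=True)

flat_total = sum(s * s for s in flat) / N
shaped_total = sum(s * s for s in shaped) / N
flat_low = band_power(flat, 1, 8)
shaped_low = band_power(shaped, 1, 8)

print(f"total noise power: flat {flat_total:.3f}, shaped {shaped_total:.3f} (LSB^2)")
print(f"low-band change:   {10 * math.log10(shaped_low / flat_low):.1f} dB")
```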
This is all graphically easy to see on a spectrum analyzer, which is a good visual tool to look at audio the way our ears perceive it. That's why the noise spectrum of dithered silence at 44.1kHz/16 is not -96dB, but significantly lower (depending on FFT parameters).
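As a back-of-the-envelope sketch of the "depending on FFT parameters" part: assuming white noise, a rectangular window, and a spectrum scaled so that a sine reads its own level, the average per-bin noise floor sits 10*log10(N/2) below the broadband level. The function name and the -93.3 dB input (TPDF-dithered 16-bit silence, 0.5 LSB RMS relative to a full-scale sine) are my own illustrative choices.

```python
import math

def per_bin_floor_db(broadband_db: float, n_fft: int) -> float:
    """Average per-bin level of white noise in an n_fft-point spectrum.

    Assumes a rectangular window and a scaling where a sine reads its own level.
    """
    return broadband_db - 10 * math.log10(n_fft / 2)

for n_fft in (1024, 8192, 65536):
    print(f"N = {n_fft:5d}: floor around {per_bin_floor_db(-93.3, n_fft):.1f} dB")
```

Each doubling of the FFT size lowers the displayed floor by about 3 dB, even though the broadband noise has not changed.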
@xiphmont has a footnote in the article I linked that notes how with an infinite window size, the dynamic range is effectively infinite; of course our ears don't have an effectively infinite window size. We can come up with a representative approximation for perceptual purposes.
Ultimately the real point is: the dynamic range of our ears is defined as the difference between the loudest (ear-damage level) sound and the quietest sound *we can perceive*. Since *we can perceive* a sound encoded at <-96dB in 16bit PCM, that is *not* the dynamic range.