XO Wave: A Bit about Dither
XO Wave Users: If you are using XO Wave for mastering CDs
or creating podcasts, you probably won't need to
know any of this because XO Wave's default dither settings are appropriate for
16-bit audio. If you are using high-end hardware, or producing anything
other than 8-bit, 16-bit and QuickTime files, please read
this page or just skip to the Rules of Thumb
for XO Wave Users below, to "cut to the chase".
Dither is a complex subject, and although you won't become an expert
just by reading this page, hopefully you can learn everything
you need to know to use dither well and make great
recordings. This introduction tries to use math as little as
possible. For more detail, you might want to
check out these links:
- Nika Aldrich wrote a very good description of dither here.
Aldrich spends some time with oscilloscopes and
waveforms to show how dither works and explain why it is
useful.
- Digital Domain has an excellent page on dither here.
This page goes into some detail and is well worth the
read.
- To find out how dither is used in XO Wave, see our technical
note Dithering in XO
Wave. You may also want to see our related technical note
Resolution in XO Wave.
Analog Distortion
Distortion in analog equipment, such as filters, tape recorders, and
mixers, is usually harmonic distortion. This type of
distortion occurs in many places around us, including our ears
and even the air, so we generally do not notice small amounts of
harmonic distortion. Harmonic distortion produces sounds that
are perceived to be "in tune" with the original audio, meaning
that harmonic distortion can be very hard to hear because the
added sounds blend in smoothly with the original sounds. Some
would argue that in small amounts harmonic distortion sounds
good, since it adds a slight color and richness to the sound. In
the extreme, of course, harmonic distortion can sound harsh and
unnatural, but at low levels and when properly used, it is
rarely bothersome.
Digital Distortion
In digital equipment, distortion is very different. As long as the
resolution is maintained at a high enough level, it is
possible to reduce distortion to the point where for all practical
purposes there isn't any. Audio resolution is usually described
in terms of the number of digital bits used to represent each
sample. XO Wave, for example, uses at least 32 or 64 bits
(depending on your choice of audio
engines) to represent samples internally. Audio CDs, on the
other hand, only use 16 bits per sample. (Of course, this raises
the question: why do we need 64 bits of resolution in XO Wave
when we are just making 16-bit CDs? We'll see the answer to this
below.)
Unfortunately, at some point, such as when burning a CD or sending
files over the Internet, we usually need to reduce the
resolution of the audio, either because the medium does not
support a higher resolution or because we want the files to
transfer in a reasonable time, and not take up too much space on
disk.
When we reduce the resolution of an audio signal, we inevitably throw
away some detail. The reduction is normally done by rounding off
the less significant data; the lost information is called
round-off error, and even though we've actually removed information
from the audio, it sounds like we've added noise.
When the signal is loud, this is not a problem
because the error is small relative to the signal and effectively
independent of it, but when the signal is quiet, the error is correlated with
the signal in a way that sounds unnatural. Because this
inharmonic distortion is not always well-managed on
digital equipment, some people describe digital audio as cold
and sterile.
The Solution
Reducing the resolution of digital audio signals results in round-off
noise that is correlated with the signal in a bad way. If we
were to listen to this noise alone it would sound like a
distant, dissonant relative of the original. The relationship
between the noise and the original signal is not considered
"consonant" or "musical" and so, even at low levels, it can be
audible and unpleasant.
We cannot eliminate the round-off noise and still reduce the
resolution of the audio, but it is possible to de-correlate it
from the original source audio. To do this, we add just enough
noise to disrupt the relationship between the original signal
and the round-off error. When used correctly, this noise, called
dither, de-correlates the round-off error from the
source audio. The important thing is to use dither
before reducing the resolution, because once
resolution is reduced, it's too late for dither to help.
Some have misunderstood dither as "covering up" the negative effects
of round-off error, but that's not quite right. Dither, when
carefully applied, actually eliminates the problem: instead of
having noise that sounds like an awful version of our original
signal, we have noise that sounds like our dither, which is
usually much more pleasant.
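A minimal sketch of the idea in Python may help. It assumes triangular (TPDF) dither, a common textbook choice and not necessarily what XO Wave uses, and it measures levels in steps of the target resolution (1 step = 1 LSB). The helper names quantize, tpdf_dither and average_output are made up for this illustration. A constant input of 0.3 LSB always rounds to 0 without dither, so the detail is simply gone; with dither added before rounding, the output averages back to roughly 0.3.

```python
import math
import random

def quantize(x):
    """Round to the nearest step of the target format (round half up)."""
    return math.floor(x + 0.5)

def tpdf_dither(rng):
    """Triangular-PDF noise spanning +/-1 LSB: the sum of two uniform values.
    (Assumed dither shape for this sketch, not taken from XO Wave.)"""
    return rng.uniform(-0.5, 0.5) + rng.uniform(-0.5, 0.5)

def average_output(level, use_dither, trials=100_000, seed=1):
    """Quantize the same input level many times and average the results."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        noise = tpdf_dither(rng) if use_dither else 0.0
        # Dither is added BEFORE the resolution is reduced.
        total += quantize(level + noise)
    return total / trials

plain = average_output(0.3, use_dither=False)    # every sample rounds to 0
dithered = average_output(0.3, use_dither=True)  # averages close to 0.3
print(plain, dithered)
```

Each dithered sample is still a whole number of steps; the low-level information survives only on average, carried by noise that no longer depends on the signal.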
What's more...
Very quiet sounds may not be represented at all in your final, low
resolution audio, and are therefore lost, the same way that
numbers between 0 and 0.5 are "lost" when rounded. With dither,
however, the quiet sounds can be combined with the dither noise,
to keep them audible. This means that dither actually increases
the dynamic range of your audio, so well-done dither can retain
detail in the signal that would otherwise be lost.
Some people simply do not believe this: after all, it seems like a
contradiction to add
noise to a signal and actually improve its dynamic range, but
it's true. To demonstrate this, here are two 8-bit audio clips
of a song with a linear fade-out. They were both produced from the same
source, but the second one has been correctly dithered, and as a
result the song is audible longer and the fade is smoother:
Fade-out without dither: 8-bit
WAV file.
Fade-out with dither: 8-bit WAV
file.
The first example sounds fine when the signal is strong, but by the
end of the fade-out, the audio starts sounding "chunky" and it pops
in and out. The second example, which has been dithered, has a
smoother and more even fade-out.
Still don't believe it? Try it yourself by exporting your XO Wave
session as an 8-bit file with and without dither. (Another
common method of demonstrating this is to play a sine wave
fading out. It disappears abruptly without dither, but if you
use dither it keeps on going and you can actually use a
stop-watch to measure the difference. Try it!)
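The sine-wave experiment can also be approximated in a few lines of Python. This is only a sketch under stated assumptions: the sample rate, frequency, TPDF dither shape and the helper names quantize and tone_level are all chosen for illustration. It looks at the tail of a fade, where the sine peaks at 0.4 of one step. Without dither every sample rounds to zero; with dither, correlating the output against a reference sine shows the tone is still there at about its original level.

```python
import math
import random

RATE = 48_000     # samples per second (assumed)
FREQ = 1_000      # tone frequency in Hz (assumed)
AMPLITUDE = 0.4   # peak level in LSBs: below half a step, like the end of a fade

def quantize(x):
    """Round to the nearest step of the target format (round half up)."""
    return math.floor(x + 0.5)

def tone_level(quantized):
    """Estimate the amplitude of the FREQ tone by correlating with a reference sine."""
    n = len(quantized)
    acc = sum(q * math.sin(2 * math.pi * FREQ * i / RATE)
              for i, q in enumerate(quantized))
    return 2 * acc / n

rng = random.Random(7)
signal = [AMPLITUDE * math.sin(2 * math.pi * FREQ * i / RATE) for i in range(RATE)]

# No dither: the quiet tone never reaches half a step, so it vanishes entirely.
undithered = [quantize(s) for s in signal]

# TPDF dither (sum of two uniform values) added before rounding.
dithered = [quantize(s + rng.uniform(-0.5, 0.5) + rng.uniform(-0.5, 0.5))
            for s in signal]

level_plain = tone_level(undithered)
level_dithered = tone_level(dithered)
print(level_plain, level_dithered)
```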
Why Use High Resolution?
You might be wondering, then, why XO Wave uses such high resolution
if we are just going to lose it all when we create 16-bit output
files. The reason is that digital audio processing consists of performing a
large number of arithmetical operations on each sample. Sometimes
hundreds or even thousands of operations are performed on a given
sample. If we use low resolution for processing, we are forced to
round off our numbers repeatedly, adding noise each time.
By working with higher precision (32 bits),
we can avoid most of this quality loss. By using even higher
precision (64 bits, available in XO Wave Pro), it is
possible to reduce the quality loss
even further. A simplified example illustrates
the problem.
Imagine we have a sequence of numbers that represents our audio:
{ 0, 15, 36, 40, 12, -7, -23 ... }. Now let's say we want to increase the
volume by 10% (about 0.8 dB). To do this, we multiply each number in
our series by 1.1, and get this:
{ 0, 16.5, 39.6, 44, 13.2, -7.7, -25.3 ... }. Looks fine, right? The
problem is that if we are representing our audio with 16-bit
integers (by far the most common 16-bit representation of
audio), we can't represent those numbers and we are forced to
round off, so we are left with this: { 0, 17, 40, 44, 13, -8, -25 ... }.
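The arithmetic above can be checked with a short Python sketch. The helper name round_half_up is hypothetical; it rounds halves upward so that 16.5 becomes 17 as in the text (Python's built-in round would give 16, because it rounds halves to the nearest even number).

```python
import math

def round_half_up(x):
    """Round to the nearest integer, with halves rounded upward."""
    return math.floor(x + 0.5)

samples = [0, 15, 36, 40, 12, -7, -23]
gain = 1.1                              # a 10% volume increase

gain_db = 20 * math.log10(gain)         # the same gain expressed in decibels
scaled = [s * gain for s in samples]    # ideal values, not representable as integers
rounded = [round_half_up(v) for v in scaled]

# The round-off error is what was thrown away on each sample.
errors = [r - v for r, v in zip(rounded, scaled)]

print(round(gain_db, 2))
print(rounded)
```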
While this example is somewhat contrived,
it represents a very real type of problem which crops up in audio
processing. The sad truth is that many audio editing packages suffer from
exactly the problem just described. For example, some packages offer
a "normalize" function which takes in a 16-bit file, multiplies
each sample by a non-integer value, and produces a 16-bit file
result. Similar methods are used in major commercial packages for
everything from creating cross-fades to applying effects.
The error from one such operation is generally fairly
small, but the accumulated errors from multiple
operations can significantly degrade the quality of audio produced this way.
If XO Wave used a low resolution internally, it would have to round
off and lose data after each and every operation. The round-off
error would accumulate until the audio signal was lost in a sea
of noise and inharmonic distortion. The distortion can be
avoided by adding dither before each round-off, but we would
still be left with the noise, which builds up over the hundreds
or even thousands of operations we might want to perform.
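To get a feel for how round-off accumulates, here is a rough Python sketch. Everything in it is an assumption made for illustration: the chain of 200 random gain changes (arranged in pairs that cancel, so the ideal result equals the input), the test signal, and the helper names to_int, process and rms. It compares rounding to an integer after every operation against processing in floating point and rounding once at the end.

```python
import math
import random

def to_int(x):
    """Round to the nearest integer sample value (round half up)."""
    return math.floor(x + 0.5)

def process(samples, gains, round_every_step):
    """Apply every gain in turn, optionally rounding after each operation."""
    out = []
    for s in samples:
        value = float(s)
        for g in gains:
            value *= g
            if round_every_step:
                value = float(to_int(value))   # low-resolution processing
        out.append(value)
    return out

def rms(values):
    return math.sqrt(sum(v * v for v in values) / len(values))

rng = random.Random(0)
gains = []
for _ in range(100):
    g = rng.uniform(0.8, 1.25)
    gains += [g, 1.0 / g]   # each pair has a net gain of 1

source = [to_int(1000 * math.sin(0.05 * n)) for n in range(1000)]

low_res = process(source, gains, round_every_step=True)
high_res = [to_int(v) for v in process(source, gains, round_every_step=False)]

# Error relative to the ideal result, which is the unchanged source.
err_low = rms([a - b for a, b in zip(low_res, source)])
err_high = rms([a - b for a, b in zip(high_res, source)])
print(err_low, err_high)
```

In this sketch the high-resolution path should come back to the source, while the path that rounds after every step drifts away from it.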
When Should I Use Dither?
If you are using XO Wave, you can skip to the next section to find out more about
using dither in XO Wave. If you are using another piece of software or hardware
for audio processing, here are some tips.
Use dither when...
- You are reducing the resolution of audio. For example, if
you are converting a 16-bit file to an 8-bit file, make sure
you are adding dither before the conversion.
- You have finished processing an audio sample and are returning
it to its previous resolution. For example,
if you are using a program that takes a 16-bit file in,
applies an effect, and creates a new 16-bit file with
the result, make sure that program dithers properly.
Don't use dither when...
- You are not reducing the resolution and you have not done
any processing. For example, if you are creating a 16-bit
file by looping another 16-bit file, you probably
won't need dither.
Rules of Thumb for XO Wave Users
Here are some simple tips for using dither in XO Wave. If you want
the nitty-gritty on how dithering is performed in XO Wave, see our
technical note on Dithering in XO Wave.
- File Export: When exporting to a file, you should use
dither if it is available; this is the default behavior.
- Playback: In general, you should set the dither in the Hardware Settings
window as follows:
- If you are using 8-bit hardware, or wish to roughly
simulate 8-bit output, select "Dither to 8 bits". Note
that nowadays 8-bit audio hardware is uncommon.
- If you are using 16-bit hardware, or wish to roughly
simulate 16-bit output, select "Dither to 16 bits".
- If you are using higher resolution hardware, select "No
Dither".
- If you are using another piece of software such as
Audio Hijack Pro or Jack to process
audio after it is produced by XO Wave, you should
select "No Dither". In this case, you should also
make sure that the other software performs proper
dithering as needed.
FYI: Jack's "shaped" dither appears to be most comparable to
XO Wave's dither. Audio Hijack Pro can provide dithering
through additional plug-ins; XO Audio has not tested this dithering.
- If your session's sample rate does not match a sample rate
provided by your hardware, Mac OS X will automatically
perform a sample-rate conversion on XO Wave's output.
In this case, the effectiveness of XO Wave's dither may
be reduced, especially if the sample rate conversion
is extreme. You will have to experiment to see if the
output is better or worse with dither. Exported files
are not subject to sample-rate conversion by the OS. (Note that ALSA, Jack and
OSS may also be configured to perform sample rate
conversions, or other processing which reduces or eliminates
the usefulness of XO Wave's dither.)
- If you require bit-transparency, select "No Dither". An example
of this is if you are editing a 16-bit file without using any
effects (including volume), cross-fades, or mixing.
However, if you change your mind and decide to
add an effect or even just a little volume,
you'll want to be sure to use dither.
-- Bjorn Roche