On Radio 4’s PM some time ago, there was some guff talked about music quality. It’s standard guff talked by people who call themselves “audiophiles”, but in fact know absolutely nothing about electricity, signals, or in fact any mathematics whatsoever.
They were very vague about it, but in essence the claim was that digital music is rubbish and vinyl music is great. This is nonsense. There is in fact no reason why either one of these is inherently better than the other (although there are good practical reasons why digital is better).
Today then, we’re going to be talking about the sampling theorem, digitisation and quantifying noise. Feel free to step away now.
The world is almost entirely analogue. The word analogue is used to mean “continuously variable”. For example, for the temperature to have changed from 15°C to 20°C it must pass through all the temperatures in between; we can guarantee that the temperature was 15.67829281872716152°C at some time (even if we don’t know what time). Similarly, time is analogue; we can always break it apart. Half a second is 500 milliseconds, half a millisecond is 500 microseconds, half a microsecond is 500 nanoseconds, and so on. To get from 16:00 to 17:00 the time has to pass through 16:42.9282827165621. Has to.
For all analogue variables, if we were to plot them on a graph, our pen would never leave the paper. Each measurement is connected to the previous measurement, so the line would be continuous. Hence we call these variables continuous variables.
Some things are not analogue. They can only take on a particular set of values. There is no in between. For example: “how many marbles are in this box?” (assuming we define marble to mean “a whole, non-broken, marble”). This variable can only take on whole number values. There can’t be 10.87 marbles in our box. “Is the front door open?” It can only be open or closed. “How many grains of salt are in this salt cellar?” You get the idea. If we were to plot these variables on a graph, our pen would have to leave the paper. There is no in between. We call variables of this type discrete variables, because they can only take on discrete values.
In computing we can only deal with discrete values. There is a finite amount of memory in the computer, there is a finite precision to any number, and there is a finite resolution to any sensor. If we want our digital devices to interact with the real world, there has to be a way of converting the discrete values of the computing world into analogue outputs, and vice versa.
Transducers convert one form of energy into another. For our purposes today we’ll be interested in those transducers which convert between electrical energy and air pressure at a point. In other words: microphones and speakers. The microphone converts air pressure (changes) into electricity. Remember that sound is nothing more than a pressure wave in a gas. This electrical signal is what engineers would call an analogue signal. The voltage (say) varies continuously as a function of air pressure, so while it has been changed in form it is still continuous.
With a few complications omitted for the sake of brevity, that electrical signal is used to drive a mechanical carving tool that physically alters a wax (say) master disc, which is then used to produce a hard stamp for a press that can mass produce vinyl records. The carving tool, being physical, passes through all positions from A to B, and each electrical connection on the way takes on every voltage from A to B. The whole chain is continuous.
Again brushing over some detail, we can convert a continuous signal into a discrete one by sampling. That is to say that we decide on a sampling period (say 20 microseconds), and then each period we note down the current analogue reading. There is a further complication: our method of measuring the analogue signal has finite resolution. While the source can vary continuously, our output can only take on discrete values. For a three-bit (one bit being capable of being zero or one) analogue to digital conversion, with a full-scale input of 1 volt, the converter forces our continuous input onto one of 8 possible discrete outputs. This process is called quantisation.
- 000 -> 0 -> 0.000 volts
- 001 -> 1 -> 0.125 volts
- 010 -> 2 -> 0.250 volts
- 011 -> 3 -> 0.375 volts
- 100 -> 4 -> 0.500 volts
- 101 -> 5 -> 0.625 volts
- 110 -> 6 -> 0.750 volts
- 111 -> 7 -> 0.875 volts
So, for example, if our input were 0.9 volts, our output would be 7, and we wouldn’t be able to distinguish it from 0.875 volts. Similarly, 0.80 volts is indistinguishable from 0.75 volts. The more bits we add to our converter, the smaller the range of voltages that can be mistaken for each other.
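The three-bit table above can be sketched in a few lines of Python. This is only an illustration: the function name `quantise` and its parameters are made up for this example, and it assumes a converter that truncates (rounds down) to the level below, which is what the 0.9 volt and 0.80 volt examples imply.

```python
def quantise(voltage, bits=3, full_scale=1.0):
    """Return (code, represented_voltage) for an idealised truncating converter.

    Hypothetical helper for illustration; real converters differ in detail.
    """
    levels = 2 ** bits            # 3 bits -> 8 levels
    step = full_scale / levels    # 1 volt / 8 -> 0.125 volts per level
    # Clamp out-of-range inputs so the code always fits in the available bits.
    clamped = min(max(voltage, 0.0), full_scale)
    code = min(int(clamped / step), levels - 1)
    return code, code * step


for v in (0.9, 0.875, 0.80, 0.75):
    code, represented = quantise(v)
    print(f"{v:.3f} V -> code {code} ({code:03b}) -> {represented:.3f} V")
```

Calling `quantise(v, bits=16)` instead shrinks the step to about 15 microvolts, which is the "more bits, smaller range of confusable voltages" point in numbers.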
We repeat this thousands of times a second, and end up with millions of these samples as a discrete sequence. Those samples get written digitally to a CD; that is, the CD stores each number in its binary representation. This is done with a pattern of pits in a reflective surface, which a laser reads back.
Here’s the big lie that audiophiles believe: since the digital samples can only take on discrete values, and the analogue system is continuously variable, the analogue system more accurately records the original input. Hence vinyl is better than CDs.
This would be true except for one thing we’ve missed out of the discussion. Noise. Noise is a random addition to our signal that adjusts it from its true value. When we think of signals we often think of them as being made up of these two components.
$$m = s + n$$

In an ideal world, $m$, which we measure, would be equal to $s$, the desired signal. It isn’t though: $n$, the random noise, is added to it. Being random, we can’t predict it, and have no way of subtracting it from our $m$. What we want then is that our desired signal be very large compared to the noise. You may have heard this value discussed as the signal-to-noise ratio, or SNR. This is (a function of) the ratio of $s$ to $n$ (if you really care, SNR is ten times the base-10 log of the power ratio of $s$ to $n$).
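For those who do really care, the decibel form mentioned in the parenthesis can be written out as a tiny Python function. The name `snr_db` and the example powers are invented for illustration; the only thing taken from the text is "ten times the log of the power ratio".

```python
import math


def snr_db(signal_power, noise_power):
    """Signal-to-noise ratio in decibels: ten times the base-10 log of the power ratio."""
    return 10.0 * math.log10(signal_power / noise_power)


# Illustrative numbers only: a signal a thousand times more powerful than the noise,
# and then the same noise with the signal power halved.
print(snr_db(1000.0, 1.0))
print(snr_db(500.0, 1.0))
```

Halving the signal power while the noise stays put costs roughly 3 dB each time, which is a handy rule of thumb to keep in your pocket.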
Electrical noise is impossible to get rid of. It can never be reduced to zero. Noise is always present because electrical components aren’t perfect. Even if they were, we would still have, for example, the noise added by the molecules vibrating in the wires because they are not at absolute zero temperature. We would have the noise of the air molecules banging against the microphone. Any wiring acts as an antenna in some shape or form and so is picking up any stray radio waves floating around.
Let’s go back to the digitisation process. We saw that quantisation meant that input values could not be perfectly represented by the output values. The difference between the true input and the presumed output is called quantisation noise.
$$s_q = s + n_q$$

$s$ would be the sample value if we had infinite precision. $n_q$ is therefore a function of how many bits we assign to our $s_q$ representation (and sampling hardware). The more bits, the smaller the quantisation noise.
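To put rough numbers on "the more bits, the smaller the quantisation noise", here is a small Python experiment. It is a sketch under stated assumptions: the inputs are uniformly random voltages between 0 and 1 volt, the converter truncates as in the three-bit table, and the function name `rms_quantisation_error` is invented for this example.

```python
import math
import random


def rms_quantisation_error(bits, n_samples=20000, full_scale=1.0, seed=0):
    """Estimate the RMS difference between true inputs and their quantised values."""
    rng = random.Random(seed)           # fixed seed so runs are repeatable
    step = full_scale / (2 ** bits)     # voltage width of one quantisation level
    total = 0.0
    for _ in range(n_samples):
        x = rng.random() * full_scale       # the "true" analogue value
        xq = math.floor(x / step) * step    # what a truncating converter reports
        total += (xq - x) ** 2
    return math.sqrt(total / n_samples)


for bits in (3, 8, 16):
    print(bits, "bits ->", rms_quantisation_error(bits), "volts RMS")
```

Each extra bit halves the step size, so the error should roughly halve too; going from 3 bits to 16 shrinks it by a factor of several thousand.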
What’s interesting is that quantisation noise is characteristically very similar to normal signal noise. We can, in fact compare it directly to our analogue noise:
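$$m = s + n \qquad s_q = s + n_q$$

$n_q$ is random and zero mean, so we can reasonably say that it behaves just like $n$ (it enters the equation in exactly the same way). Now, it’s worth remembering that our input signal was analogue and came with its own noise. So it’s perhaps more accurate to write:

$$s_q = s + n + n_q$$

That is to say, our sample, $s_q$, is made up of the true input plus a sample of the analogue noise plus quantisation noise. (This is actually a simplification: noise is characterised by its variance, and for independent noise sources the variances add, which is the same as saying the standard deviations add in quadrature. So the total noise variance, $\sigma_t^2$, would be equal to $\sigma_n^2 + \sigma_q^2$.)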
Here is where we can put paid to the audiophile lie then. If the quantisation noise is small compared to the analogue noise, small enough that it is negligible, then there is no difference whatsoever between analogue and digital. Given that we can choose (within reason) our quantisation noise (by sampling at higher resolution, i.e. more bits), our quantisation noise can always, potentially, be negligible.
(A quick note: while at first glance it appears that our periodic sampling of the analogue signal has thrown away all those analogue values that we never looked at, it doesn’t actually matter. By sampling, what we have done is throw away bandwidth, not information. Provided there was no information contained in that bandwidth of the analogue signal, the digital samples can perfectly reconstruct the analogue signal. We’d have to start talking about Fourier synthesis to understand this, and this article is already too long.)
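A quick calculation shows how fast the quantisation noise becomes negligible. Everything here is an assumption chosen for illustration: an analogue noise floor of 1 millivolt RMS on a 1 volt full-scale signal, independent noise sources (so the variances add), and an ideal rounding converter, whose error has the standard RMS value of one step divided by the square root of 12. The helper name `total_noise_rms` is made up.

```python
import math


def total_noise_rms(analogue_rms, quantisation_rms):
    """Combine two independent noise sources: variances add, so RMS values add in quadrature."""
    return math.sqrt(analogue_rms ** 2 + quantisation_rms ** 2)


ANALOGUE_RMS = 1e-3   # assumed analogue noise floor, volts RMS (illustrative)
FULL_SCALE = 1.0      # assumed full-scale input, volts

for bits in (3, 8, 16):
    step = FULL_SCALE / 2 ** bits
    quantisation_rms = step / math.sqrt(12)   # ideal rounding quantiser
    total = total_noise_rms(ANALOGUE_RMS, quantisation_rms)
    print(f"{bits:2d} bits: quantisation {quantisation_rms:.2e} V, total {total:.6f} V")
```

With these made-up numbers the 3-bit converter dominates the noise, while at 16 bits the total is indistinguishable from the analogue noise that was there anyway.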
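Without going into the Fourier side of it, the claim can at least be poked at numerically. The sketch below samples a 1 Hz sine wave 8 times a second and then uses sinc interpolation (the Whittaker-Shannon formula) to recover a value between two samples. The tone, the sample rate and the function names are all choices made for this example, and because only a finite run of samples is used, the reconstruction is very close rather than exact.

```python
import math


def sinc(x):
    """Normalised sinc: sin(pi x) / (pi x), with the limit value 1 at x = 0."""
    return 1.0 if x == 0 else math.sin(math.pi * x) / (math.pi * x)


def reconstruct(samples, first_index, sample_rate, t):
    """Sinc-interpolate the signal value at time t from a finite run of samples."""
    return sum(
        value * sinc(t * sample_rate - (first_index + k))
        for k, value in enumerate(samples)
    )


SAMPLE_RATE = 8.0   # samples per second (assumed, well above twice the tone)
TONE = 1.0          # frequency of the test sine wave in Hz (assumed)
N = 400             # samples kept either side of time zero

samples = [math.sin(2 * math.pi * TONE * n / SAMPLE_RATE) for n in range(-N, N + 1)]

t = 0.3   # a moment that falls between two sampling instants
estimate = reconstruct(samples, -N, SAMPLE_RATE, t)
truth = math.sin(2 * math.pi * TONE * t)
print("reconstructed:", estimate, "actual:", truth)
```

The value at 0.3 seconds was never measured, yet it comes back out of the samples. The small leftover difference comes from cutting the sample run off at a finite length, not from the sampling itself.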
If the digital signal can perfectly reconstruct the analogue, then the two are equivalent. One can’t be better than the other then, right? Right (neither analogue nor digital is inherently “better”). Nearly. “Nearly” because we have to bring in the practicality that we want to send/record our source signal from where it was generated to the place where it is wanted (your ears).
When noise is added to an analogue signal, the signal is damaged irreparably. When noise is added to a digital signal, within certain bounds, the signal may be reconstructed. This is because the digital signal is only ever ‘on’ or ‘off’, so the noise has to be very big to push an ‘on’ into something that looks like an ‘off’.
Let’s say we want to send a signal 100 miles. Every 25 miles, the signal has halved in power (noise never degrades, so relatively the noise is bigger — get your VHS out and record a recording of a recording and you’ll get the same effect). With digital we can place a repeater station at 25 mile intervals to detect and reconstruct a noise-free copy of the original. For an analogue signal, we can’t do this. The best we can do is amplify the signal before sending it on again. Unfortunately, that amplifies the noise as well. This is why the world is going digital — if you can get the signal through, then the original can be reconstructed perfectly. The key property that digital representations have is that they can be perfectly copied.
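The repeater argument can be played out as a toy simulation in Python. It is deliberately crude and every number is an assumption: each 25 mile leg is modelled as adding independent Gaussian noise of 0.05 volts RMS to a 0 or 1 volt signal, with the loss and re-amplification folded together into unit gain. The digital repeater decides "on" or "off" and sends a clean level onwards; the analogue relay can only pass along what it received, noise included.

```python
import random


def send_leg(values, noise_sigma, rng):
    """One leg of the journey: every value picks up independent Gaussian noise."""
    return [v + rng.gauss(0.0, noise_sigma) for v in values]


def regenerate(values):
    """Digital repeater: decide 'on' or 'off' and emit a clean 1.0 or 0.0."""
    return [1.0 if v > 0.5 else 0.0 for v in values]


rng = random.Random(42)   # fixed seed so the run is repeatable
bits = [float(rng.randint(0, 1)) for _ in range(1000)]
LEGS = 4        # 100 miles in 25 mile legs
SIGMA = 0.05    # assumed noise per leg, volts RMS

digital = bits
analogue = bits
for _ in range(LEGS):
    digital = regenerate(send_leg(digital, SIGMA, rng))
    analogue = send_leg(analogue, SIGMA, rng)   # amplifying would boost the noise too

bit_errors = sum(d != b for d, b in zip(digital, bits))
analogue_rms = (sum((a - b) ** 2 for a, b in zip(analogue, bits)) / len(bits)) ** 0.5
print("digital bit errors:", bit_errors)
print("analogue RMS error:", analogue_rms)
```

The digital copy arrives identical to what was sent, because the noise on any one leg is nowhere near big enough to flip a bit. The analogue copy has collected the noise from all four legs, and adding more legs only makes it worse.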
If we keep the digital path of the signal as long as possible and only convert from and to analogue as close as possible to the sources and destinations then we can keep noise at an absolute minimum. All the noise added during the transmission can be completely bypassed by using a digital representation. Provided we have sampled at high enough resolution (and frequency, but dragons lie the way of that discussion) then our quantisation noise will be negligible compared to the inherent analogue noise and reconstruction will be perfect.
There is a legitimate argument that CDs today don’t sound as good as old vinyl records. The reason it’s legitimate is nothing to do with analogue versus digital; it’s to do with monkey versus man. You see, being digital gives sound engineers enormous powers to shape and manipulate the sound, powers that were not available to analogue audio technicians. Unfortunately, with great power goes great responsibility, and modern sound engineers (or more likely their record company bosses) misuse that power. They tend to turn the compression dial up full; they mess with the equalizer to optimise the sound power at various frequencies to match human hearing frequency response; and in general make a godawful mess of what they’ve got.
It’s all subjective though. Perhaps this pushing and prodding makes it sound better to more people. So be it, what do I care? However, if you find yourself thinking that it is CDs and MP3s that are to blame for why music sounds awful to you — think again.
I wanted to point out that I’m not arguing that every digital music device produces perfect sound. They are still products with varying qualities. A crappy CD player will sound worse than a good vinyl record player — it’s nothing to do with the CD or the vinyl though — it’s merely engineering quality. There is also the bizarre contradiction that since with digital it is harder to get it completely wrong; a great many manufacturers produce rubbish digital equipment because it’s “good enough”. If they had used the same standards for analogue equipment manufacture it would not have been “good enough”. Contradiction: digital music often sounds worse than analogue music because digital retains its quality more easily.
An article using similar engineering knowledge to debunk the 24 bit, 192 kHz myth.