Now that we know what sound is, let us turn our thoughts towards recording the sound and storing it on a computer. The first step in this process is to convert the sound wave into an electrical signal. When a continuous signal of one physical quantity (here, voltage) is used to represent a signal of another quantity (here, air pressure), we call it an analog signal or, in the case of a sound wave, an analog audio signal. You are probably already familiar with the device that performs this conversion: the microphone.
Analog signals have many uses, but most computers cannot work with them directly. Computers can only operate on sequences of discrete binary numbers, also known as digital signals. We need to convert the analog signal recorded by the microphone into a digital signal, that is, digital audio, before the computer can understand it.
The most common method used to represent analog signals digitally is pulse code modulation (PCM). The general idea of PCM is to sample (or measure) the amplitude of the analog signal at fixed time intervals, and store the results as an array of numbers (called samples). Since the original data is continuous, and numbers on a computer are discrete, samples need to be rounded to the nearest available number, in a process known as quantization. Samples are usually stored as integer numbers, but it is also possible to use floating-point numbers as shown in the following example:
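The sampling and quantization steps described above can be sketched in a few lines of Python. This is only an illustration, not a full PCM encoder: the function names `sample_sine` and `quantize_int16` are hypothetical, the "analog" input is simulated with a sine function, and the 8,000 Hz rate is chosen just to keep the output short.

```python
import math

def sample_sine(frequency_hz, sample_rate_hz, duration_s):
    """Measure the amplitude of a sine wave at fixed time intervals.

    Returns floating-point samples in the range [-1.0, 1.0].
    """
    count = int(round(sample_rate_hz * duration_s))
    return [math.sin(2.0 * math.pi * frequency_hz * n / sample_rate_hz)
            for n in range(count)]

def quantize_int16(samples):
    """Round each floating-point sample to the nearest 16-bit integer value."""
    max_value = 32767  # largest positive value of a signed 16-bit integer
    return [int(round(s * max_value)) for s in samples]

# One millisecond of a 440 Hz tone sampled at 8,000 Hz (assumed values).
float_samples = sample_sine(440.0, 8000, 0.001)
int_samples = quantize_int16(float_samples)

for f, i in zip(float_samples, int_samples):
    print(f"{f:+.6f} -> {i:+d}")
```

Each printed line shows the same sample twice: once as a floating-point number and once as the integer it was rounded to. The small difference between the two, after scaling, is the quantization error.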
There are two ways to control the quality of the sampled audio:
Sampling rate: Also known as the sampling frequency, it is the number of samples taken for each second of audio. According to the Nyquist sampling theorem, the sampling rate must be at least twice the highest frequency present in the analog signal in order to allow a proper reconstruction. Because human hearing extends to roughly 20,000 Hz, you will usually work with values of 44,100 Hz or 48,000 Hz. The following figure compares sampling at different rates:
Bit depth: Also known as the resolution, it is the number of bits used to represent a single sample. This controls the number of possible discrete values that each sample can take (a bit depth of n bits allows 2^n values), and needs to be high enough to keep quantization errors small. You will usually work with bit depths of 16 bits or 24 bits, stored as integer numbers, or 32 bits stored as floating-point numbers. The following figure compares sampling at different resolutions:
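To put numbers on these two settings, the following Python sketch computes the minimum sampling rate required by the Nyquist theorem and the number of discrete values available at a given bit depth. The helper names are hypothetical. The 20,000 Hz upper limit of hearing and the [-1.0, 1.0] amplitude range (a full scale of 2.0) are assumptions made for the example.

```python
def min_sampling_rate(highest_frequency_hz):
    """Nyquist: the rate must be at least twice the highest frequency."""
    return 2.0 * highest_frequency_hz

def quantization_levels(bit_depth):
    """Number of discrete values an integer sample of this size can take."""
    return 2 ** bit_depth

def quantization_step(bit_depth, full_scale=2.0):
    """Spacing between adjacent levels when they span the full amplitude range.

    Rounding to the nearest level is off by at most half of this step.
    """
    return full_scale / quantization_levels(bit_depth)

# Both common rates satisfy the requirement for a 20,000 Hz signal.
for rate in (44100, 48000):
    print(rate, "Hz is enough:", rate >= min_sampling_rate(20000.0))

# Each extra bit doubles the number of levels and halves the step.
for bits in (16, 24):
    print(bits, "bits:", quantization_levels(bits), "levels, step",
          quantization_step(bits))
```

Going from 16 to 24 bits multiplies the number of levels by 256, which shrinks the worst-case rounding error by the same factor.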