As we saw from the beginning, audio augmentation is a challenging topic without hearing the audio recording in question, but we can visualize the techniques’ effect using Waveform graphs and zoom-in charts. Still, there is no substitution for listening to the before and after augmentation recordings. You have access to the Python Notebook with the complete code and audio button to play the augmented and original recordings.
First, we discussed the theories and concepts of an audio file. The three fundamental components of an audio file are amplitude, frequency, and sampling rate. The measurements of unit for frequency are Hertz (Hz) and Kilohertz (kHz). Pitch is similar to frequency, but the unit of measurement is the decibel (dB). Similarly, bit rate and bit depth are other forms expressing the sampling rate.
Next, we explained the standard audio augmentation techniques. The three essentials are time stretching, time shifting, and pitch scaling. The others are...