For some time now I’ve not been happy with the playback quality of audio in leJOS on the EV3. Although the EV3 speaker is small, we really should have been able to do better. The audio playback mechanism used in 0.8.1-beta and earlier is based upon the standard Lego audio playback kernel modules. It uses a combination of the EV3 PWM (pulse width modulation) hardware and a high-resolution timer to provide playback at a fixed sample rate (8 kHz) and a fixed sample size (8 bits). It works, but the results are not great. Here is a recording of the EV3 playing an audio clip:
[audio https://googledrive.com/host/0BwTg7xhdb1rYZmlQYk5WRUlFeWc/sip0.8.1.wav ]
Not too bad, but a little shaky and distorted. To really hear what is going on, we need a simpler, purer sample. Here is a recording of the EV3 repeatedly playing a simple three-note piano tune:
[audio https://googledrive.com/host/0BwTg7xhdb1rYZmlQYk5WRUlFeWc/lejos0.8.1.wav ]
You can hear how distorted the playback is and that there are noticeable clicks at the start and end of playing the .wav file.
Doing something about this has been on my “todo” list for some time, but I’d not found a good way to address the issues. Then a few weeks ago the ev3dev team posted a video showing the audio capabilities of ev3dev. Although the code was not directly usable in the leJOS Linux kernel, some of the ideas certainly were. After a little bit of work in the kernel and kernel modules we get
[audio https://googledrive.com/host/0BwTg7xhdb1rYZmlQYk5WRUlFeWc/sipnew.wav ]
[audio https://googledrive.com/host/0BwTg7xhdb1rYZmlQYk5WRUlFeWc/lejosnew.wav ]
which are much better. As part of this work the supported formats have also been improved: you can now use sample rates from 8 kHz to 48 kHz with 8-bit or 16-bit samples.
So why is the new implementation better? The original code used the PWM hardware to feed a series of on/off pulses to an audio amplifier. The width of these pulses was changed at the sample rate (8 kHz) and was basically proportional to the sample value. Things were actually a little more complex, as the PWM hardware generated 8 pulses per sample, which raised the pulse frequency above the range audible to the human ear (a technique called oversampling). Although in theory this is a good way to generate an audio waveform (given the limited resources available), the actual implementation had problems:
- The sample rate was too low for good quality.
- The sample size was too small.
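To make the original scheme a little more concrete, here is a minimal Java sketch of how an 8-bit sample could be turned into a pulse width. This is not the Lego kernel code: the class name, the method names and the `periodTicks` value of 2000 are all made up for illustration. Only the 8 kHz rate, the 8 pulses per sample and the "width proportional to sample value" idea come from the description above.

```java
// Illustrative only: names and the tick count are assumptions,
// not code from the Lego kernel modules or from leJOS.
final class PwmPulseSketch {
    static final int SAMPLE_RATE_HZ = 8000;  // fixed rate of the original driver
    static final int PULSES_PER_SAMPLE = 8;  // each sample is emitted as 8 pulses

    /** Frequency of the pulse train actually sent to the amplifier. */
    static int carrierHz() {
        return SAMPLE_RATE_HZ * PULSES_PER_SAMPLE;
    }

    /** On-time, in timer ticks, for an unsigned 8-bit sample (0..255). */
    static int pulseWidthTicks(int sample, int periodTicks) {
        if (sample < 0 || sample > 255) {
            throw new IllegalArgumentException("sample must be 0..255");
        }
        // Width is proportional to the sample value: 0 is always off, 255 is always on.
        return sample * periodTicks / 255;
    }

    public static void main(String[] args) {
        int periodTicks = 2000; // hypothetical length of one PWM period in timer ticks
        System.out.println("carrier = " + carrierHz() + " Hz");
        for (int sample : new int[] {0, 64, 128, 255}) {
            System.out.println("sample " + sample + " -> "
                    + pulseWidthTicks(sample, periodTicks) + " ticks on");
        }
    }
}
```

With only 256 possible widths, small changes in the waveform simply cannot be represented, which is the "sample size" problem in the list above.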
Increasing the sample size is relatively easy, but needs care because the value defining the maximum pulse length varies as the pulse rate changes and basically gets smaller as the rate increases (which in turn reduces the available resolution).
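The trade-off between pulse rate and resolution can be shown with a little arithmetic. The sketch below is only an illustration: the 132 MHz PWM clock is an assumed value that I have not checked against the leJOS kernel source, and the class and method names are hypothetical. The point is simply that the number of timer ticks in one PWM period is the clock divided by the pulse rate, so a faster pulse rate leaves fewer distinct pulse widths to choose from.

```java
// Illustrative arithmetic only. The clock value below is an assumption,
// not a figure taken from the leJOS or ev3dev kernel source.
final class PwmResolutionSketch {
    static final long ASSUMED_PWM_CLOCK_HZ = 132_000_000L;

    /** Timer ticks in one PWM period: this is the maximum pulse length. */
    static int periodTicks(long clockHz, int pwmRateHz) {
        return (int) (clockHz / pwmRateHz);
    }

    /** Whole bits of resolution available, i.e. floor(log2(periodTicks)). */
    static int usableBits(int periodTicks) {
        if (periodTicks < 1) {
            throw new IllegalArgumentException("periodTicks must be >= 1");
        }
        return 31 - Integer.numberOfLeadingZeros(periodTicks);
    }

    /** Scale a signed 16-bit sample onto the range 0..periodTicks. */
    static int pulseWidthTicks(short sample, int periodTicks) {
        long unsigned = sample + 32768L;          // shift -32768..32767 to 0..65535
        return (int) ((unsigned * periodTicks) >> 16);
    }

    public static void main(String[] args) {
        for (int rate : new int[] {24_000, 48_000, 64_000}) {
            int ticks = periodTicks(ASSUMED_PWM_CLOCK_HZ, rate);
            System.out.println(rate + " Hz -> " + ticks + " ticks per period, about "
                    + usableBits(ticks) + " usable bits");
        }
    }
}
```

Under these assumptions a 16-bit sample is still squeezed into far fewer than 65536 distinct pulse widths, and the squeeze gets tighter as the pulse rate goes up.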
Increasing the sample rate is not so simple. The high-resolution timer generates an interrupt 8000 times a second. This imposes a load on the system, and even at this rate the interrupts are not always delivered exactly on time. This leads to jitter in the way the samples are played, and the ear is very good at detecting these sorts of problems. Increasing the sample rate to 16 kHz and above would simply impose too much load on the system and the jitter would get much worse.
The solution used by the ev3dev team (and now leJOS) is to make use of more of the EV3 hardware: the FIQ and additional features of the PWM unit. The processor chip used in the EV3 supports something called a FIQ (Fast Interrupt reQuest), which is basically a very fast interrupt mechanism that completely bypasses the standard Linux interrupt system. Using this allows for very low overhead, high-frequency interrupts that are always delivered on time. In our new solution these interrupts are generated not by a timer (as in the original implementation), which is hard to synchronize with the PWM hardware, but instead by the PWM hardware itself. The AM1808 PWM unit has a feature that will generate an interrupt after 1, 2 or 3 complete PWM cycles. This is used to trigger the FIQ, which in turn loads the new pulse width into the PWM unit. The combination of these two hardware features allows 16-bit samples to be fed smoothly to the PWM unit at rates between 24 kHz and 64 kHz, with very little intervention by the Linux kernel.
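The real work here is done in the kernel, but the flow of events can be modelled in a few lines of Java. The following is a toy simulation, not the actual FIQ handler: the class and method names are hypothetical, and the idea that the sample rate equals the PWM rate divided by the number of cycles per interrupt is my simplification of the mechanism described above. It just shows the PWM unit "asking" for the next sample after every 1, 2 or 3 complete cycles, so that each update lands exactly on a cycle boundary.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only: hypothetical names, not the leJOS or ev3dev kernel code.
final class PwmFeedSimulation {

    /** Sample rate that results when the handler runs once every n PWM cycles. */
    static int sampleRateHz(int pwmRateHz, int cyclesPerInterrupt) {
        return pwmRateHz / cyclesPerInterrupt;
    }

    /** Returns the pulse width used on each successive PWM cycle. */
    static List<Integer> simulate(int[] samples, int cyclesPerInterrupt) {
        if (cyclesPerInterrupt < 1 || cyclesPerInterrupt > 3) {
            throw new IllegalArgumentException("the PWM unit offers 1, 2 or 3 cycles");
        }
        List<Integer> widthPerCycle = new ArrayList<>();
        if (samples.length == 0) {
            return widthPerCycle;
        }
        int width = samples[0];      // value currently loaded in the PWM unit
        int next = 1;                // index of the next sample to load
        int cyclesSinceInterrupt = 0;
        while (true) {
            widthPerCycle.add(width);                 // one complete PWM cycle
            if (++cyclesSinceInterrupt == cyclesPerInterrupt) {
                cyclesSinceInterrupt = 0;             // the PWM unit raises its interrupt
                if (next == samples.length) {
                    break;                            // nothing left to play
                }
                width = samples[next++];              // handler loads the new pulse width
            }
        }
        return widthPerCycle;
    }

    public static void main(String[] args) {
        int[] samples = {10, 20, 30};
        System.out.println(simulate(samples, 2));
        System.out.println(sampleRateHz(48_000, 3) + " samples per second");
    }
}
```

Because the update is triggered by the PWM unit itself rather than by a separate timer, there is no drift between the two, which is what removes the jitter in this model.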