os.elk.audio

As a follow-up to the first part of our four-part blog series Audio Latency demystified, this post covers our experience with real-time Linux approaches. In the previous post we briefly discussed scheduling latency and its effect on CPU performance in a conventional operating system like Linux, and highlighted the need to tweak Linux to make it suitable for real-time audio processing. In this post we look at the finer details of latency in the Linux kernel and the mechanisms available to reduce it, along with each method’s pros and cons. As always, hit us back with comments or write to us at tech@mindmusiclabs.com

Task scheduling Latency

One source of latency induced by general-purpose operating systems is task scheduling latency (sometimes called interrupt-to-task latency). It is the time the operating system takes to wake up a user-mode task waiting on an interrupt, that is, the time elapsed between the interrupt trigger and the execution of the corresponding task. It accumulates from several separate latencies inside the operating system.

Task scheduling latency is the sum of the interrupt latency (the time from when an interrupt is generated to when its source is serviced), the interrupt processing time (the time taken by the interrupt service routine) and the time the scheduler takes to perform a context switch.
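To get a feel for this number on a given system, one common approach is to sleep until an absolute deadline and record how late the task actually wakes up. The sketch below does this with a timer as the interrupt source. It is a minimal illustration under our own assumptions, not a replacement for a dedicated tool, and the function name `measure_wakeup_latency_ns` is hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <time.h>

#define NSEC_PER_SEC 1000000000LL

/* Convert a timespec to a single nanosecond count. */
static int64_t timespec_to_ns(const struct timespec *ts)
{
    return (int64_t)ts->tv_sec * NSEC_PER_SEC + ts->tv_nsec;
}

/*
 * Sleep until a series of absolute deadlines and return the worst
 * observed delay (in ns) between a deadline and the moment the task
 * is running again. Returns -1 if a clock call fails.
 */
int64_t measure_wakeup_latency_ns(int iterations, int64_t period_ns)
{
    struct timespec next, now;
    int64_t worst = 0;

    assert(iterations > 0 && period_ns > 0);

    if (clock_gettime(CLOCK_MONOTONIC, &next) != 0)
        return -1;

    for (int i = 0; i < iterations; i++) {
        /* Move the deadline one period into the future. */
        int64_t deadline = timespec_to_ns(&next) + period_ns;
        next.tv_sec = (time_t)(deadline / NSEC_PER_SEC);
        next.tv_nsec = (long)(deadline % NSEC_PER_SEC);

        /* Block until the deadline; retry if a signal interrupts us. */
        int rc;
        do {
            rc = clock_nanosleep(CLOCK_MONOTONIC, TIMER_ABSTIME, &next, NULL);
        } while (rc == EINTR);
        if (rc != 0)
            return -1;

        /* The difference between "now" and the deadline is the latency. */
        if (clock_gettime(CLOCK_MONOTONIC, &now) != 0)
            return -1;
        int64_t latency = timespec_to_ns(&now) - deadline;
        if (latency > worst)
            worst = latency;
    }
    return worst;
}
```

Calling `measure_wakeup_latency_ns(10000, 1000000)` samples a 1 ms period 10000 times. The worst case matters more than the average here, because a single late wake-up is enough to cause an audible dropout.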

Standard Linux Kernel

There are numerous areas in the Linux kernel where the kernel is not preemptible; in simple terms, it does not yield the CPU to higher-priority tasks such as an audio processing task. The kernel is constantly being improved to make it more deterministic, and it is still usable for audio when the audio task runs under a real-time scheduling policy (SCHED_FIFO) with larger buffer sizes, at the cost of large round-trip latencies. Today, a number of solutions for achieving real-time capability with Linux are available. These can be broadly classified into two categories: Linux kernel enhancements and hypervisor-based solutions (a hypervisor is a piece of software that allows users to run multiple operating systems on the same hardware by virtualizing the hardware resources).
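As a rough illustration of the two knobs mentioned above, the sketch below switches the calling process to SCHED_FIFO and computes how much delay a single audio buffer adds. The helper names `set_fifo_priority` and `buffer_latency_ms` are ours, and the buffer sizes in the usage note are only example values. Switching to SCHED_FIFO normally requires root or an appropriate rtprio limit.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <sched.h>

/*
 * Put the calling process under the SCHED_FIFO real-time policy.
 * Returns 0 on success or an errno value (for example EPERM when
 * the process lacks the privilege to use real-time priorities).
 */
int set_fifo_priority(int priority)
{
    int min = sched_get_priority_min(SCHED_FIFO);
    int max = sched_get_priority_max(SCHED_FIFO);

    if (min == -1 || max == -1)
        return errno;
    if (priority < min || priority > max)
        return EINVAL;

    struct sched_param param = { .sched_priority = priority };
    if (sched_setscheduler(0, SCHED_FIFO, &param) != 0)
        return errno;
    return 0;
}

/* Delay contributed by one buffer of `frames` samples, in milliseconds. */
double buffer_latency_ms(unsigned frames, unsigned sample_rate_hz)
{
    assert(sample_rate_hz > 0);
    return 1000.0 * frames / sample_rate_hz;
}
```

For example, `buffer_latency_ms(1024, 48000)` is about 21.3 ms per buffer, while `buffer_latency_ms(64, 48000)` is about 1.3 ms. The smaller the buffer, the less scheduling jitter the system can absorb before the audio task misses its deadline.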

Linux kernel with PREEMPT_RT (Kernel enhancement)

One of the most popular approaches to achieving predictable latencies in Linux is the PREEMPT_RT patch. It is an enhancement of the Linux kernel whose main idea is to improve the preemptibility of Linux and reduce interrupt latencies. It is also the approach most commonly used in practice (e.g. Ubuntu Studio).

Linux kernel with PREEMPT_RT with Core isolation

On a PREEMPT_RT enabled kernel, the performance of a real-time audio task can be improved even further with a technique known as core isolation. One or more CPU cores on a multi-core processor can be isolated so that the Linux scheduler does not place ordinary tasks on them. With the help of the CPU affinity syscalls, a real-time audio task can then be moved onto such an isolated core, where it no longer competes with other tasks and so avoids the overhead of context switches.
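The affinity step can be sketched as follows, assuming a core has already been isolated (for instance with the `isolcpus=` kernel boot parameter). The helper name `pin_to_cpu` is hypothetical; the underlying call is the Linux-specific `sched_setaffinity`.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <sched.h>

/*
 * Restrict the calling thread to a single CPU core.
 * Returns 0 on success or an errno value on failure.
 */
int pin_to_cpu(int cpu)
{
    cpu_set_t set;

    if (cpu < 0 || cpu >= CPU_SETSIZE)
        return EINVAL;

    /* Build an affinity mask that contains only the requested core. */
    CPU_ZERO(&set);
    CPU_SET(cpu, &set);

    /* A pid of 0 applies the mask to the calling thread. */
    if (sched_setaffinity(0, sizeof(set), &set) != 0)
        return errno;
    return 0;
}
```

On a system booted with core 3 isolated, the audio thread would call `pin_to_cpu(3)` once at start-up, before entering its processing loop.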

Pros:

Simple to use and very close to the mainline Linux kernel.
Tried and tested with the JACK server by many developers.
Close to being the industry standard right now.

Cons:

Larger and less deterministic latencies than hypervisor based solutions.

Jailhouse (Hypervisor based solution)

Jailhouse is a hypervisor built to run bare metal applications alongside Linux. Jailhouse is a resource partitioning hypervisor and it is more concerned with isolation of resources than virtualization. Jailhouse enables asymmetric multiprocessing (AMP) on top of an existing Linux setup and splits the system into isolated partitions called “cells”. A running Jailhouse system has at least one cell known as the “Root cell” or “Linux cell”. It contains the Linux system used to initially launch the hypervisor and to control it afterwards.

Pros:

Low deterministic latencies (Good for hard real-time requirements).
Vertical partitioning of software layers (Independent software stack for Linux and real-time application).
Safe mechanism to isolate real-time tasks from non real-time tasks (Sandboxing).

Cons:

Requires virtualization support from HW (VT-x / VT-d, AMD-V / IOMMU).
Not yet completely ready for industry (Supports few ARM boards).
Cannot reuse anything from Linux’s infrastructure for the real-time application (Device drivers, file system, networking, etc).
Suffers from cache thrashing when L2/L3 caches are shared between CPUs.
Need config files to be set up correctly for dividing resources (complicated).

RTAI (Hypervisor based solution)

RTAI stands for Real-Time Application Interface. RTAI is also a hypervisor-based solution, which uses interrupt virtualization offered by a thin abstraction layer known as “Adeos” (Adaptive Domain Environment for Operating Systems). There are multiple entities called domains, which act as sandboxes within which operating systems run independently. The operating systems in different domains can still communicate with each other through the Adeos abstraction layer. Unlike Jailhouse, it is essentially a horizontal separation of software stacks for real-time and non real-time domains, with hardware resources being shared by all the domains (real-time and non real-time). Real-time tasks run in kernel space with the same privileges as the kernel.

Adeos uses an interrupt pipe to propagate interrupts through the different domains running on the hardware. It allows some domains to have the priority to receive the interrupts first, and pass them over to the next domain if they are not handled in that domain.
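The pipeline idea can be illustrated with a toy model in plain C. This is not Adeos code: the types, the handler signature and the interrupt number `AUDIO_IRQ` are all hypothetical, and the only point is to show an interrupt being offered to domains in priority order until one of them handles it.

```c
#include <stddef.h>

#define AUDIO_IRQ 42 /* hypothetical interrupt number for the audio device */

/* A handler returns 1 if it consumed the interrupt, 0 to pass it on. */
typedef int (*irq_handler_t)(int irq);

struct domain {
    const char *name;
    irq_handler_t handler;
};

/*
 * Offer the interrupt to each domain in turn. The array is assumed to
 * be ordered from highest to lowest priority. Returns the index of the
 * domain that handled the interrupt, or -1 if none did.
 */
int pipeline_dispatch(const struct domain *domains, size_t count, int irq)
{
    for (size_t i = 0; i < count; i++) {
        if (domains[i].handler(irq))
            return (int)i;
    }
    return -1;
}

/* The real-time domain only cares about the audio interrupt. */
static int rt_handler(int irq)
{
    return irq == AUDIO_IRQ;
}

/* Linux, at the end of the pipeline, takes whatever is left. */
static int linux_handler(int irq)
{
    (void)irq;
    return 1;
}

/* Dispatch through a two-domain pipeline: real-time first, then Linux. */
int dispatch_example(int irq)
{
    static const struct domain pipeline[] = {
        { "real-time", rt_handler },
        { "linux", linux_handler },
    };
    return pipeline_dispatch(pipeline, sizeof(pipeline) / sizeof(pipeline[0]), irq);
}
```

In this model `dispatch_example(AUDIO_IRQ)` returns 0 because the real-time domain sees the interrupt first and handles it, while any other interrupt number falls through to Linux and returns 1.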

Pros:

Availability of APIs for developers to write real-time applications.
Availability of Linux-like services (thread management, event management, etc).
Lower latencies than PREEMPT_RT and much better determinism of real-time tasks.
Domains can communicate safely with each other.

Cons:

Real-time tasks run as kernel space threads (unsafe, and presents programming challenges).
There are no memory protection mechanisms, and real-time tasks are not protected from each other.
RTAI is licensed under GPL, thus application source code cannot be made proprietary.

Xenomai

Xenomai is a spinoff project from RTAI. Like RTAI, Xenomai uses an interrupt abstraction layer, here called I-pipe. I-pipe is the new incarnation of Adeos; the main difference between the two is that I-pipe is written specifically with Xenomai as its use case. Xenomai allows real-time tasks to be executed in user space.

I-Pipe

I-pipe inserts itself between the Linux kernel and the interrupt controller hardware. It intercepts interrupts and other system events and delivers them to the domains in order of their static priority. Hence, it is possible to provide timely and predictable delivery of events.

Xenomai Architecture

Xenomai enabled Linux architecture. Image: Radboud University, Nijmegen

The figure above shows the software architecture of a Xenomai patched kernel, with Xenomai running as a co-kernel alongside Linux. Xenomai and Linux are registered as two domains over the Adeos based interrupt pipeline.

Xenomai offers a well-equipped framework for developing real-time device drivers, which run like any other Linux kernel driver but safely in the real-time domain.

Pros:

A good mix of horizontal and vertical separation of hardware resources. Each domain is sandboxed but shares the underlying hardware and software parts.
Allows you to use any generic part of Linux that can be useful, for example the hardware abstraction layer (HAL).
Real-time applications can run in user space.
Provides a wide range of services for developers similar to POSIX APIs.

Cons:

Calling Linux syscalls while running in real-time domain can degrade performance.
Need to write RTDM drivers for the peripherals used in real-time domain.

Conclusion

To conclude, each approach to real-time Linux has its own challenges. Yet we are seeing an increase in the use of Linux for real-time applications, because new approaches keep improving its real-time performance. It is also worth looking at Dovetail, an upcoming project based on Xenomai which tries to improve the dual-kernel architecture.

We will continue the Audio Latency demystified blog series; stay tuned for the next post, which will be about how to measure round-trip audio latency accurately!

Audio Latency demystified, part II: Real-time Linux approaches
