Real-Time Ubuntu 18.04, part I: kernel
The first in a series of posts about my adventure into the realtime Linux world.
Instruction-esque stuff inside, if you dare to follow along.
About this post
In this post series I’ll cover my adventures of setting up a fully preemptible, Linux-RT-powered Ubuntu 18.04 on an industrial-grade embedded-ish PC.
I’ll try to keep it instructional-ish (lots of those ‘ishes, huh?), so you can easily adjust the process to your requirements.
Also, this method should work more or less the same on Debian and all of its derivatives: other Ubuntu flavors and versions, Linux Mint and so on.
Why embedded-ish?
It uses regular parts (6th gen Intel CPU, SO-DIMM RAM etc) but is 100% passively cooled, has a wider operating temperature range, a sealed chassis and so on.
On the other hand, it’s much bigger than your typical Raspberry Pi-like SBC and offers some desktop/server-like features: embedded RAID controller, 3 digital video outputs, 4 USB 3.0 ports, integrated 8-port Gigabit Ethernet switch with PoE, a few internal mPCI-E connectors and so on.
Why real-time at all?
Because it’ll serve as a navigation/motion controller for an autonomously navigating robot of quite substantial size (think half a metric tonne of weight).
It’ll have to e.g. send signals to various microcontrollers via a CAN-bus with very high timer accuracy so that higher-level systems can assume timely execution of their commands.
It’s possible that it’ll also have to handle some digital IOs for e.g. handling some other, non-CAN signals from/to other parts of the robot (the PC is already equipped with some GPIO).
Gee, that’s cool and all but what does real-time mean here? Is it edible? How’s that different from my machine with a regular desktop OS?
It’s not really edible, no, but it sure can bear fruit!
Even if of a somewhat different kind…
Real-time here means that, if I want a program running on this PC to have exactly timed behavior, it can.
My favorite comparison is the office building I’m working in.
My office is on the 1st floor and there is a lift and a staircase allowing me to get there, although they’re situated on opposite sides of the building.
The lift is closer to where I arrive from, it’s usually quicker and I don’t have to climb any stairs - that’s the regular desktop OS.
The staircase, however, requires circling the building and a bit of climbing - that’s the RT OS.
Now, what if someone calls the lift to the 5th floor just before me and blocks it there? Suddenly the lift takes much longer and I can’t tell in advance by how much, while the staircase takes exactly as long as it always does.
In summary - the staircase (RT) is usually slower and sometimes a bit of a hassle compared to the lift (non-RT), but it always takes the same amount of time and that’s something very desirable in certain situations.
Prerequisites
After that excessively prolonged intro, let’s dive right into the technical stuff.
First, do a regular system installation.
As my system will work as an embedded one, I’ve chosen Ubuntu 18.04 Server (the newest was 18.04.2 at the time) for that task - no X server, not much BS and it even prompts to install OpenSSH server within the installer itself.
I’ve purged the system of any unnecessary stuff later on, but going with something minimalistic at start helps quite a lot.
After installation all I did was update the system and enable the color prompt for bash (come on, that should be the default, give me some eyecandy):
sudo apt update
sudo apt upgrade
sed -i 's/^#force_color_prompt/force_color_prompt/' ~/.bashrc
Linux-RT fully preemptible kernel
The first step into the RT world is an environment that allows preemption.
For Linux there are but a few choices: Linux-RT, Xenomai, RTAI.
There’s a good write-up comparing these available here: http://linuxgizmos.com/real-time-linux-explained/
For me, Linux-RT was the way to go, as it doesn’t require writing applications with a specific RT approach in mind.
Preparation
Packages
First I had to install a few packages required for building the kernel 2:
sudo apt install build-essential libncurses5-dev gcc libssl-dev bc flex bison git
Sources
The second stage was the preparation of the kernel sources to build.
They consist of:
- vanilla Linux kernel sources
- Linux-RT patches
For everything to go relatively smoothly, the kernel and patch versions must match exactly.
To achieve this, first I went to the official linux-rt wiki and determined the version of the latest non-development RT patches.
The patch versions are listed in the right-side panel (Actively maintained PREEMPT_RT versions).
For me (around May 2019), the latest non-development version listed was 4.19-rt and after opening it I’ve determined the exact version to be 4.19.37-rt20.
After determining the versions all that’s left is to download and unpack the sources and the patches and to apply the latter to the former:
mkdir -p ~/linux-rt
cd ~/linux-rt
wget https://www.kernel.org/pub/linux/kernel/v4.x/linux-4.19.37.tar.xz
wget https://cdn.kernel.org/pub/linux/kernel/projects/rt/4.19/patch-4.19.37-rt20.patch.xz
xz -cd linux-4.19.37.tar.xz | tar xvf -
cd linux-4.19.37
xzcat ../patch-4.19.37-rt20.patch.xz | patch -p1
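Since the kernel and patch versions have to match, both download URLs can be derived from the RT patch version alone. A minimal sketch, assuming a three-part kernel version such as 4.19.37-rt20 and the same URL layout as in the commands above; rt_urls is a name made up for this example:

```shell
# Derive both download URLs from an RT patch version such as "4.19.37-rt20".
# Assumes the URL layout used in the wget commands above (hypothetical helper).
rt_urls() {
    rt_version=$1                      # e.g. 4.19.37-rt20
    kernel_version=${rt_version%-rt*}  # strip the "-rtNN" suffix -> 4.19.37
    major=${kernel_version%%.*}        # first component          -> 4
    series=${kernel_version%.*}        # drop the last component  -> 4.19
    echo "https://www.kernel.org/pub/linux/kernel/v${major}.x/linux-${kernel_version}.tar.xz"
    echo "https://cdn.kernel.org/pub/linux/kernel/projects/rt/${series}/patch-${rt_version}.patch.xz"
}
```

Something like `rt_urls 4.19.37-rt20 | xargs -n1 wget` would then fetch both files, and a version mismatch becomes impossible by construction.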
Kernel configuration
Configuration scripts automatically pick up the running kernel’s configuration and use it as a base.
Therefore all I had to do was open the configuration menu:
make menuconfig
and enabling the preemption:
General setup
Preemption Model
Fully Preemptible Kernel (RT)
That’s the required stuff.
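Before kicking off a long build it may be worth confirming that the choice actually landed in .config. A small sketch, assuming the 4.x RT patch sets name the option CONFIG_PREEMPT_RT_FULL (other kernel series may use a different symbol); has_rt_full is a hypothetical helper:

```shell
# Succeed if the given kernel config selects the fully preemptible model.
# CONFIG_PREEMPT_RT_FULL is assumed to be the symbol behind
# "Fully Preemptible Kernel (RT)" in the 4.x RT patches.
has_rt_full() {
    grep -q '^CONFIG_PREEMPT_RT_FULL=y$' "${1:-.config}"
}
```

Run from the kernel source directory: `has_rt_full && echo "RT preemption selected"`.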
I’ve additionally slimmed the kernel down by:
- disabling all debugging things I could’ve found
- disabling all unnecessary driver modules (e.g. exotic FS drivers, exotic/legacy hardware drivers etc)
Build and install
This one is quite straightforward so here goes the no fluff version:
make -j"$(nproc)" deb-pkg
cd ..
sudo dpkg -i linux-*.deb
Okay, there’s some important fluff going on here: I’ve built the kernel and installed it using the Debian package.
This will NOT work on non-Debian-based distros.
You’ve been warned.
Now reboot and, with some luck (as in: the bootloader, be it Grub or whatever, picking up the proper kernel), your system should boot into the RT-enabled kernel!
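To check whether the bootloader really picked the new kernel, the build string reported by uname can be inspected. A rough sketch, assuming an RT build advertises itself with "PREEMPT RT" (or "PREEMPT_RT") in the output of uname -v; is_rt_version_string is a hypothetical helper:

```shell
# Succeed when a `uname -v` string looks like it comes from an RT build.
# The "PREEMPT RT" marker is an assumption about how the kernel labels itself.
is_rt_version_string() {
    case $1 in
        *"PREEMPT RT"*|*PREEMPT_RT*) return 0 ;;
        *) return 1 ;;
    esac
}

if is_rt_version_string "$(uname -v)"; then
    echo "RT kernel is running"
else
    echo "not an RT kernel"
fi
```

RT-patched kernels typically also expose /sys/kernel/realtime, so `cat /sys/kernel/realtime` printing 1 is another hint.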
Fine-tuning
There’s always a lot of room for improvement.
The above method skips even some simple ones, like disabling unneeded modules.
Or one of the more important ones: disabling loadable module support altogether.
It locks the kernel down at compile time and prevents anything from being loaded (or unloaded) unexpectedly while the system is running.
Purging the system
I required a minimalistic, BS-free system, thus I’ve disabled most of the built-in services, features and fluff.
First came the things that pop up during boot and take oh-so-long to load or even fail due to timeouts: cloud-init and snapd, the snappy daemon.
Then came other unneeded services - to identify them I’ve used htop (tree view [F5] is a very neat feature), systemctl status and systemctl --failed.
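Another way to get an overview is to list everything that is enabled at boot. This is a sketch of a filter, not what I originally used; it assumes the two-column "unit state" output of systemctl list-unit-files, and enabled_units is a made-up name:

```shell
# Read `systemctl list-unit-files --no-legend` output on stdin and print
# the names of the units whose state column is "enabled".
enabled_units() {
    awk '$2 == "enabled" { print $1 }'
}
```

Usage: `systemctl list-unit-files --no-legend | enabled_units`.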
Cloud-init
Disabling cloud-init, according to the docs 3, should be fairly straightforward:
sudo touch /etc/cloud/cloud-init.disabled
sudo apt purge --auto-remove cloud-init cloud-guest-utils
Snapd
Snapd, the snappy daemon, is composed of several services, sockets, timers and mount targets.
Thus disabling it includes disabling oh-so-many targets.
Note: the snap-core-6350.mount target might have a different name for you - check with systemctl status.
sudo systemctl disable snapd.service
sudo systemctl disable snapd.autoimport.service
sudo systemctl disable snapd.core-fixup.service
sudo systemctl disable snapd.seeded.service
sudo systemctl disable snapd.snap-repair.service
sudo systemctl disable snapd.snap-repair.timer
sudo systemctl disable snapd.system-shutdown.service
sudo systemctl disable snap-core-6350.mount
sudo systemctl disable snapd.socket
sudo apt purge --auto-remove snapd
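Because the mount unit name contains the snap revision, it can be looked up instead of typed by hand before running the commands above. A small sketch, assuming the unit names follow the snap-NAME-REVISION.mount pattern seen here; snap_mount_units is a hypothetical helper:

```shell
# Read `systemctl list-units --type=mount --no-legend` output on stdin and
# print every unit name that matches snap-*.mount.
snap_mount_units() {
    awk '{
        for (i = 1; i <= NF; i++)
            if ($i ~ /^snap-.*\.mount$/) { print $i; break }
    }'
}
```

Usage: `systemctl list-units --type=mount --no-legend | snap_mount_units`.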
Networking
This part varies very much for different distros, flavors or even desktop-targeting versions of Ubuntu - those might use NetworkManager without netplan or other configurations.
For me, I’ve disabled netplan and enabled ifupdown, allowing simple network configuration via /etc/network/interfaces.
My approach is based on this post from AskUbuntu 4.
Of course, replace the interface names and IPs according to your needs and network configuration.
Note: you might want to additionally set the Domains parameter in /etc/systemd/resolved.conf.
The complete command list was as follows:
sudo apt install ifupdown
# Write the ifupdown configuration (quoted heredoc, so nothing gets expanded)
sudo tee /etc/network/interfaces > /dev/null <<'EOF'
source /etc/network/interfaces.d/*

auto lo
iface lo inet loopback

allow-hotplug enp1s0
auto enp1s0
iface enp1s0 inet dhcp

allow-hotplug enp0s31f6
auto enp0s31f6
iface enp0s31f6 inet static
    address 10.0.0.10
    netmask 255.255.255.0
    broadcast 10.0.0.255
    gateway 10.0.0.1
    dns-nameservers 9.9.9.9 8.8.8.8
EOF
sudo sh -c "ifdown --force enp1s0 enp0s31f6 lo && ifup -a"
sudo systemctl unmask networking
sudo systemctl enable networking
sudo systemctl restart networking
sudo systemctl stop systemd-networkd.socket systemd-networkd.service networkd-dispatcher systemd-networkd-wait-online
sudo systemctl disable systemd-networkd.socket systemd-networkd.service networkd-dispatcher systemd-networkd-wait-online
sudo systemctl mask systemd-networkd.socket systemd-networkd.service networkd-dispatcher systemd-networkd-wait-online
sudo apt purge --auto-remove nplan netplan.io
sudo sed -i 's/^#DNS=.*/DNS=9.9.9.9 8.8.8.8/' /etc/systemd/resolved.conf
sudo systemctl restart systemd-resolved.service
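When adapting the static stanza to another subnet, the broadcast line has to follow from the address and the netmask: every host bit gets set to 1. A minimal sketch of that arithmetic in plain sh; broadcast_of is a name invented for this example:

```shell
# Compute the broadcast address for a dotted-quad address and netmask:
# each octet is OR-ed with the inverted netmask octet.
broadcast_of() {
    old_ifs=$IFS
    IFS=.
    set -- $1 $2          # split both dotted quads into eight octets
    IFS=$old_ifs
    echo "$(( $1 | (255 - $5) )).$(( $2 | (255 - $6) )).$(( $3 | (255 - $7) )).$(( $4 | (255 - $8) ))"
}
```

For the values used above, `broadcast_of 10.0.0.10 255.255.255.0` prints 10.0.0.255.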
Other services
I’ve disabled all unnecessary services:
- ureadahead - it depends on some of the disabled debugging parts in the kernel
- systemd-timesyncd - the network time synchronization daemon - I don’t need it and I don’t want it to start spamming the network in random moments
- LXD, LXCFS - these are the Linux Containers support services (daemon, filesystem etc) and I won’t use containers anyway
- unattended-upgrades & apt-daily - quite self-explanatory
- LVM - no logical volumes on this machine!
- ATD and CRON - task schedulers
- Polkit, ufw, apparmor - policy management, firewall, app security stuff
- fstrim, rsync, iscsi* - filesystem stuff (one sync can kill the RT)
- journald, syslog, apport - system journal/log (I might reenable these for debugging/development), crash reporting
- backlight, motd, vm tools, remote filesystem - some minor unneeded stuff
For ease of copy-pasting, here’s the complete list of commands I’ve used:
sudo systemctl disable ureadahead.service
sudo systemctl disable systemd-timesyncd.service
sudo systemctl disable lxcfs.service
sudo systemctl disable lxd-containers.service
sudo systemctl disable lxd.socket
sudo systemctl disable unattended-upgrades.service
sudo systemctl disable apt-daily.timer
sudo systemctl disable apt-daily-upgrade.timer
sudo systemctl disable lvm2-lvmetad.socket
sudo systemctl disable lvm2-lvmpolld.socket
sudo systemctl disable lvm2-monitor.service
sudo systemctl disable atd.service
sudo systemctl disable cron.service
sudo systemctl disable polkit
sudo systemctl disable ufw.service
sudo systemctl disable apparmor.service
sudo systemctl disable fstrim.timer
sudo systemctl disable rsync.service
sudo systemctl disable iscsi.service
sudo systemctl disable iscsid.socket
sudo systemctl disable systemd-journald.socket
sudo systemctl disable systemd-journald.service
sudo systemctl disable systemd-journald-audit.socket
sudo systemctl disable systemd-journald-dev-log.socket
sudo systemctl disable rsyslog.service
sudo systemctl disable apport-autoreport.path
sudo systemctl disable apport-forward.socket
sudo systemctl disable systemd-backlight@backlight\:intel_backlight.service
sudo systemctl disable system-systemd\\x2dbacklight.slice
sudo systemctl disable motd-news.timer
sudo systemctl disable open-vm-tools.service
sudo systemctl disable remote-fs.target
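The same list can also be fed to a loop that keeps going when a unit does not exist on a given machine. A sketch only: disable_units and the SYSTEMCTL override (handy for a dry run) are made up for this example:

```shell
# Disable every unit passed as an argument and report the ones that fail
# instead of stopping at the first error.
# SYSTEMCTL is a hypothetical override; set it to "echo" for a dry run.
disable_units() {
    for unit in "$@"; do
        ${SYSTEMCTL:-sudo systemctl} disable "$unit" \
            || echo "could not disable $unit" >&2
    done
}
```

A dry run such as `SYSTEMCTL=echo disable_units atd.service cron.service` only prints the commands that would be executed.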
Additional configuration
As the networking is now quite simplistic, the hostname had to be added to /etc/hosts so that the system can resolve its own name:
sudo sh -c "echo '127.0.1.1 `hostname`.localdomain `hostname`' >> /etc/hosts"
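Running the command above twice appends the line twice. A slightly more careful sketch that only adds the entry when no 127.0.1.1 line exists yet; ensure_hosts_entry is a hypothetical helper and needs to run as root when pointed at /etc/hosts:

```shell
# Append "127.0.1.1 <name>.localdomain <name>" to a hosts file unless a
# 127.0.1.1 line is already present. The file defaults to /etc/hosts.
ensure_hosts_entry() {
    name=$1
    file=${2:-/etc/hosts}
    if ! grep -q '^127\.0\.1\.1[[:space:]]' "$file"; then
        printf '127.0.1.1 %s.localdomain %s\n' "$name" "$name" >> "$file"
    fi
}
```

Usage from a root shell: `ensure_hosts_entry "$(hostname)"`.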
That’s all the initial configuration I did - of course, more will be needed down the road.
Testing
First, a reboot is obviously required to apply all the stuff described above.
For testing I’ve used the cyclictest utility from the rt-tests package.
This test means nothing without load, so for that I’ve smashed the machine with stress going wild on CPU (-c), memory (-m), disk (-d) and IO (-i), all of those in as many threads as the CPU can deliver.
Additionally I’ve repeatedly cloned the whole Linux kernel from another shell (stresses IO, network, CPU and RAM).
It went something like this:
sudo apt install rt-tests stress
sudo cyclictest -a -t -n -p99
sudo stress -c 8 -i 8 -m 8 -d 8
git clone git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git linux-git
After a few hours of stress-testing the latencies maxed out at about 40us.
To put that in perspective, a similarly spec’d machine with regular desktop Xubuntu peaked above 1ms (1000us) within a minute of starting the test.
I call that a win!
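For longer or repeated runs, reading the worst case off the screen gets tedious. A sketch of a filter that pulls the largest value out of cyclictest’s per-thread lines, assuming each of them ends with a "Max: N" field in microseconds; max_latency is a name invented here:

```shell
# Print the largest "Max:" value found in cyclictest output on stdin.
# Assumes per-thread lines of the form "... Min: a Act: b Avg: c Max: d".
max_latency() {
    awk '{
        for (i = 1; i < NF; i++)
            if ($i == "Max:" && $(i + 1) + 0 > max) max = $(i + 1) + 0
    } END { print max + 0 }'
}
```

Combined with cyclictest’s quiet mode and a fixed duration, it could be used like `sudo cyclictest -a -t -n -p99 -q -D 1h | max_latency`.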