Systemd, and the future of Linux init processing

By joe

November 29, 2014 - 6 minutes read - 1225 words

An interesting thing happened over the last few months and years. Systemd, a replacement init process for Linux, gained more adherents, and supplanted the older style init.d/rc scripting in use by many distributions. Ubuntu famously abandoned init.d style processing in favor of upstart and others in the past, and has been rolling over to systemd. Red Hat rolled over to Systemd. As have a number of others. Including, surprisingly, Debian. For those whom don’t know what this is, think of it this way. When you turn your machine on, and it starts loading the OS, init is the first process that runs, and it handles starting up all the rest of the system. Its important, and it needs to do its job well. We care about it for Scalable OS, as we take control the normal startup procedure to handle our use cases. We work well with init.d and a number of others right now. We’ll have to explore systemd a bit more, but in general I am not expecting anything earth shattering. This is in part, because the vast majority of what we see in the init type systems … well … for lack of a better phrase, just sucks. Linux has been transitioning to an event based system with udev for a while. Udev is a rule based mechanism to handle events, with a kernel and user space component. Woe be unto those whom mess with the way udev wants to work, as the scripts behind it are … broken … badly … . I say this as someone whom has tried, very hard in a number of cases, to fix the broken-ness. In many cases I’ve discovered its easier to ignore the broken section and add intelligence into the setup/config code to work around the udev brain death. Specifically, Linux does a great job at diskless booting. That is, until you share some directories. That udev needs. And assumes it has private copies of. So you have to work around that. You can’t fix udev … your fixes won’t be accepted upstream, and will just break at next OS update. So its easier to hack around it, and use a very light diversionary slight-of-bits touch to make sure it does what you want. Another great example is md RAID1 OS drives, and RHEL6/CentOS6. We actually had to hack the whole initramfs approach to work around the broken udev module that brings up raids (and failed to correctly bring up raids for the system to fully start up). Yeah, I know … open source makes it possible. Terrible implementation makes it necessary. Upstart was a little more sane, but still had some issues. init.d/rc in Debian 7 is reasonable, though we’ve seen still quite a bit of breakage. This all goes to the philosophy of the distro. Are they trying to be everything to everyone, or be a very well crafted system for a set of purposes. Too many want the former, not enough the latter. Scalable OS is all about the latter. Make it boot, easily (not quickly now, but thats coming), make it just work. Systemd promises to make startup in the init process different, pluggable (think udev and its horror), and so forth. We’ll have to play with it to see if it is mostly harmless or not. I suspect its going to cause at least little grief with our startup mechanism, so we’ll see if we need to work around it, or throw it away. During startup, many distros have concepts where they read (assumed local) configuration files to set up file systems, networks, functions. This is a lousy thing to do for clouds, clusters, etc. You really want a distributed control mechanism that provides these config options. Scalable OS has this implicit within it. But to get to this distributed control layer, you need network access. And this is where most distros are sheer and utter crap in their network setup code. We have a far better way built into Scalable OS, that was born of the frustration of dealing with the distros broken network config mechanisms. Generally speaking, you should never start a dhcp process on an ethernet port that doesn’t have a carrier present (after bringing the port up). Yet, this is exactly what most distros do by default. It gets even more interesting when you invoke udev, pci scanning in the kernel (done in a different order from a previous kernel, so items are discovered in a different order), so that some machines are absolutely unable to get back onto the network after a kernel update. Yeah, we’ve seen/experienced this. Quite common with RHEL/CentOS kernels updating to ours. And we’ve got work arounds to deal with udev when we need to. The question we have is, will systemd make this better? Worse? Not impact this? I suspect that the pci scan done by the kernel won’t change much, its simply how systemd will respond to this. We know how udev/init.d respond to things, and we’ve done our process change to remove the terrible/useless sections whereever possible. Though we still, on occasion get bit by udev race conditions. Udev is a piece of work. Fantastic for small machines without much stuff. Absolutely, completely borked for machines with lots of stuff. We see some occasional, fantastic, non-controllable race conditions in udev processes, that init is handling. My hope is that systemd is far smarter than its predecessors. I hate having to tell people that the solution to this seemingly mad system is to reboot it. Yet udev will drive you to this. Hopefully we can move past that. But, if not, we’ll do what we’ve done with the others, and work around it, disabling what gets in the way. That coupled with our configuration mechansim (not quite CoreOS like, we aren’t using etcd right now, but we’ll be evaluating it and other options … build vs “buy”), and we’ll be fine. Actually far better than any system that depends upon external mechanisms (like Chef, Puppet, et al) to configure machines post installation. Hardware is code, and should be automatically configured. This is what Scalable OS does, and I am hoping that systemd won’t get in our way. Even if it does, Debian just forked over this. So I am not worried at all. There are some folks saying this is the “end of Linux” or other such fluff. Not likely, but in the end, the operating system is an implementation detail (machine/container as code, its merely a configuration option). As long as I can use the hardware well, I am happy. Right now, for better or worse, Linux has the best driver support in market, albeit sometimes a maddening driver support (see binary only modules delivered by OEMs without a clue). There are other choices … I’d love to see better driver support for Illumos based machines, and *BSD (though these generally do have OK driver support). But we need Infiniband support, we need 40GbE and above support, we need memory channel storage and NVMe support for our customers. Limited choices there now. So systemd will be a challenge to get through, but I am not overly worried. I see the OS as a substrate upon which to run bare metal/containerized/VMed apps. Systemd shouldn’t impact that too much, and if it does, it will be swept away.