Daemon-ize your processes on the cheap!

Or, Death to /etc/rc.local!

This post should be treated as an historical artifact. It probably contains broken external links and it may no longer reflect my views or opinions.

TL;DR, or The Exec­utive Summary: Init scripts are hard. Here’s a bunch of UNIX back­ground backing up an argument for using stand-alone Process Super­visors whenever you need a new instance of a custom daemon spun up.

The setup

Picture, if you will, a pile of code.

Yeah, that’s good. The software equiv­alent of THAT mess.

If you write software, good odds that you’ve written at least one of these steaming sadness piles. If you work in oper­a­tions, there’s better odds that you’ve been handed at least one failure-pile (this week). You or someone a lot like you needed a message consumer, some new hotness message bus, a stand-alone process that just listens for specific connec­tions and commands on some random port or socket, or just some long-running process that runs non-interactively in the back­ground (this being the very defi­n­ition of daemon, by the way) and now this thing needs to run in production. In the wild. In the world at large, with other, nicer or bigger daemons grinding up page-to-page with it.

The communication or documentation attached to this pig-pile of meadow-muffin-code almost always looks like this:

“Oh, just execute this and Bob’s Your Uncle, we’re all set! Just stick it in /etc/rc.local!”

LD_LIBRARY_PATH=/some/crazy/nonstandard/path \
    /path/to/some/executable \
      OPTION OPTION OPTION OPTION OPTION >> /dev/null 2>&1 &

Yeah… we’re not doing that. Why? For so, so many good reasons – redefining global variable scope, no PID tracking, no management, it gives me hives, because I said so, etc. ad infinitum.

Ignore the insane number of positional arguments¹ that were handed down in the instructions for running our newly minted buffalo chip and consider instead this nugget of hard truth: you can pretty much plan on every process that rc.local starts being killed unceremoniously on reboot with QUIT, TERM, or KILL. That may or may not be a deal breaker right there, because this application may well not have any exit handlers defined, and QUIT might well leave it in an inconsistent state since it means “quit and dump core”. On top of that, rc.local gives you no mechanism for tracking the PID beyond judicious grep-ing, and no simple status lookup.
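For contrast, here is roughly the bare minimum of PID tracking that the rc.local one-liner skips. It is a sketch, not anything from the instructions quoted above: the function name `is_running` and the pidfile path are made up for illustration, and it assumes that whatever launched the daemon wrote its PID into that file.

```shell
#!/bin/sh
# Hypothetical helper (not part of any real tool): succeed only if the PID
# recorded in the given pidfile belongs to a process we can signal.
is_running() {
    pidfile=$1
    [ -r "$pidfile" ] || return 1        # no readable pidfile: treat as down
    pid=$(cat "$pidfile")
    case "$pid" in
        ''|*[!0-9]*) return 1 ;;         # empty or non-numeric: stale or garbage
    esac
    # Signal 0 delivers nothing; kill only checks that the process exists and
    # that we may signal it (so another user's process reads as "down").
    kill -0 "$pid" 2>/dev/null
}

# Assumed pidfile location, purely illustrative.
if is_running /var/run/mydaemon.pid; then
    echo "mydaemon is running"
else
    echo "mydaemon is not running"
fi
```

Even this much is more bookkeeping than a line in rc.local will ever do for you, and it still says nothing about whether the PID in the file is really your daemon or just a recycled number.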

What about rc.local?

I think that use of rc.local these days is also sort of sloppy; it’s basi­cally an admission that you can’t be bothered to implement any sort of process management framework for daemons that you consider important enough to start every time the system is booted.

Well, what are my options?

At this point, you can either write an init script, daemonize your code², or set up a Process Supervisor. Daemonizing is interesting if you’re the developer who wrote the Cleveland Steamer, but it’s out of scope if you’re just the poor bastard running Ops. I’ve beaten the init script drum for a long time, but after years of fixing substandard, poorly written init scripts I’ve realized that writing a good init script is hard:

You can only do so much (sanely and safely) in Shell.

Shell functions suck. True story. Bourne Shell and bash functions can only return an exit status (so basically just a small integer). If you want to get some sort of data from them you’re echo-ing to STDOUT and capturing the output. That’s weak sauce. tcsh doesn’t do functions and zsh probably isn’t installed on your machine unless you specifically put it there. So look forward to limited string and integer handling abilities and a lot of roll-your-own string construction.
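To make the complaint concrete, here is a minimal sketch of the two ways a shell function can hand something back. Both function names are hypothetical and the pidfile path is only an example.

```shell
#!/bin/sh
# Hypothetical helper: hands back a string the only way a shell function can,
# by printing it to STDOUT for the caller to capture.
pidfile_for() {
    printf '/var/run/%s.pid\n' "$1"
}

# Hypothetical helper: "return" carries nothing but an exit status (0-255),
# so all it can report is success (0) or failure (non-zero).
is_integer() {
    case "$1" in
        ''|*[!0-9]*) return 1 ;;
        *)           return 0 ;;
    esac
}

pidfile=$(pidfile_for mydaemon)     # command substitution captures STDOUT
echo "pidfile: $pidfile"

if is_integer "8080"; then
    echo "8080 looks like a port number"
fi
```

Anything richer than a string or a yes/no answer means inventing your own output format and parsing it back out again.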

Using ksh? You’re adorable.

Fun fact: writing an init script in something other than a shell language is frowned upon: once in a while you come across an init system that forces its dependent scripts to run through /bin/sh. Not often, but sometimes. So remember, just because you can do something doesn’t always mean you should.

Shoving a process to the back­ground (sanely and safely) is tricky.

If an appli­cation doesn’t properly detach and back­ground itself then you’re relying on the good old ampersand and reliable old nohup to get your cow pie off the console and running when the TTY hangs up.
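Here is a rough sketch of the ampersand-and-nohup routine with the one thing people usually forget, which is writing down the PID. The helper name `start_detached` is invented for this example, and `sleep 30` is just a stand-in for a real daemon.

```shell
#!/bin/sh
# Hypothetical helper: launch a command detached from the console and record
# its PID. Usage: start_detached PIDFILE COMMAND [ARGS...]
start_detached() {
    pidfile=$1
    shift
    # nohup makes the command ignore SIGHUP; redirecting all three standard
    # streams keeps it off the TTY (and stops nohup creating nohup.out).
    nohup "$@" < /dev/null > /dev/null 2>&1 &
    # $! is the PID of the most recent background job. nohup normally execs
    # the command, so this is the command's own PID.
    echo "$!" > "$pidfile"
}

# Illustrative run only.
pidfile=$(mktemp)
start_detached "$pidfile" sleep 30
echo "started PID $(cat "$pidfile")"
kill "$(cat "$pidfile")"    # tidy up the demo process
rm -f "$pidfile"
```

Note that this is still the cheap version: there is no new session, no double fork, and no change of working directory, which is exactly the sort of thing a properly daemonized application would handle itself.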

Dropping priv­i­leges (sanely, safely, and correctly) is also tricky.

I’ve seen a lot of easy-to-diagnose-in-hindsight mistakes made with sudo and su in home-brew init scripts… dropping priv­i­leges incor­rectly is probably the most common error I see after output redi­rection errors (which are RAMPANT).

If you’re looking for more detail, here’s a great Stack Overflow conver­sation about that.
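Since output redirection errors got a special mention, here is a small sketch of the classic one: getting the order of the redirections backwards. The `noisy` function is a hypothetical stand-in for a daemon, and `appuser` and the paths in the comments are placeholders.

```shell
#!/bin/sh
# Hypothetical stand-in for a daemon: one line to STDOUT, one to STDERR.
noisy() {
    echo "to stdout"
    echo "to stderr" >&2
}

# Wrong order: "2>&1" is applied first, so STDERR is pointed at wherever
# STDOUT goes *right now* (the console); only then is STDOUT sent to the log.
log_wrong() {
    noisy 2>&1 >> "$1"
}

# Right order: STDOUT goes to the log first, then STDERR follows it there.
log_right() {
    noisy >> "$1" 2>&1
}

# A related trap exists under sudo. In
#   sudo -u appuser /path/to/daemon >> /var/log/app.log 2>&1
# the redirection is performed by the calling shell, not by appuser.
# Wrapping the command keeps the file handling with the unprivileged user:
#   sudo -u appuser sh -c 'exec /path/to/daemon >> /var/log/app.log 2>&1'

log=$(mktemp)
log_wrong "$log"     # "to stderr" escapes to the console
log_right "$log"     # both lines land in the log
cat "$log"
rm -f "$log"
```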

Multiple instances are not tricky – they’re just plain hard or stupid.

If you need to run the same code base in four separate instances with a different argument passed to each instance, you’re basically rolling your own loop around reading the contents of a .d directory someplace in /etc and writing configuration snippets (my preferred method), or duplicating the same init script four times and changing that value by hand in each instance. I am not really a fan of either: every time we have to roll our own anything or edit something in place we introduce room for a whole host of new bugs that QE probably won’t catch.
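The roll-your-own loop over a .d directory tends to look something like the sketch below. Everything named here is an assumption for the sake of the example: the helper names, the /etc/mydaemon.d location, and the convention of one .conf snippet per instance.

```shell
#!/bin/sh
# Hypothetical helper: call a command once per *.conf snippet in a directory.
# Usage: for_each_instance CONFDIR COMMAND [ARGS...]
# COMMAND receives the instance name and the snippet path as its last two
# arguments.
for_each_instance() {
    confdir=$1
    shift
    for conf in "$confdir"/*.conf; do
        [ -e "$conf" ] || continue     # glob matched nothing: skip the literal pattern
        "$@" "$(basename "$conf" .conf)" "$conf"
    done
}

# Illustrative callback: a real init script would start the daemon here.
start_instance() {
    echo "would start instance '$1' using $2"
}

# /etc/mydaemon.d is an assumed location, not a standard one.
for_each_instance /etc/mydaemon.d start_instance
```

It works, but every start, stop, and status action in the init script now has to go through that loop, and each one is another place for a bug to hide.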

Those are all bummers

I know! That’s what brought me around to the Process Supervisor school of thought! A Process Supervisor is a very robust (and usually very small) daemon that gets started as part of the standard init process but then manages its own defined list of dependent applications from that point on. Process Supervisors are a bit like xinetd or daemontools, except that they’re usually a little more robust and flexible than xinetd (because they don’t sit around waiting for incoming connections to initialize and spin up dependents) and, unlike daemontools, they’re usually not littering my man hier 7 with superfluous bits and bobs (…they’re probably also still supported). A Process Supervisor like God, Supervisor, or Monit will also handle local process monitoring for you. That means you get PID tracking, frozen process management, process reaping, process spawning, and service restarting for free. And it’ll do it all without having to write brittle init scripts and cronjobs to check process status every 5 minutes. We hates fragile, brittle cronjobs so much.
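To show what “spawning, reaping, and restarting” boils down to, here is a deliberately tiny toy. It is not how God, Supervisor, or Monit are implemented; the `supervise` function and the `RESTART_DELAY` variable are invented for this sketch, and a real supervisor does far more (logging, back-off, health checks, privilege handling).

```shell
#!/bin/sh
# Hypothetical toy supervisor: run a command, and if it exits, start it again,
# up to MAX_RESTARTS times. Usage: supervise MAX_RESTARTS COMMAND [ARGS...]
supervise() {
    max_restarts=$1
    shift
    restarts=0
    while :; do
        "$@" &                          # spawn the child in the background
        child=$!                        # PID tracking: we know who we started
        status=0
        wait "$child" || status=$?      # reap the child, keep its exit status
        if [ "$restarts" -ge "$max_restarts" ]; then
            return "$status"            # restart budget spent: give up
        fi
        restarts=$((restarts + 1))
        sleep "${RESTART_DELAY:-1}"     # crude pause; RESTART_DELAY is an assumed knob
    done
}

# Illustrative run: a "daemon" that dies immediately is started three times
# in total (one initial start plus two restarts).
supervise 2 sh -c 'echo "daemon starting"; exit 1' || echo "gave up with status $?"
```

The point of the real tools is that somebody else has already written, tested, and packaged the grown-up version of this loop.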

I’ve also begun to see tremendous maintenance and support value in divorcing the init system from the sun-baked mud-bricks of PID 1 entirely: Process Supervisor dependent configurations will migrate almost seamlessly if you’re siloing your application stack (interpreters, libraries, application code). This means that they can be packaged and repackaged easily… and that offers a very reasonable migration path between operating system variants, and that means that they can insulate you from some of the more political decisions made at the OS level (Upstart vs. SystemD vs. SysV³ being the one we’re most concerned with here). What a beautiful, logical chain of run-on sentences!

Now that we’ve powered through some back­ground, pain points, and philosophy, let’s take a break. I’m working on part two, where I take a high level look at what I consider to be the two most service agnostic, highest quality options available: Super­visor and Monit. That should hope­fully have some graphs. Nerds love graphs!

  1. There’s a ton of thought behind argument parsing and using named parameters vs. positional parameters (XKCD forum, Peter Szilagyi, Greg Wooledge, SHELLdorado). You should maybe read some of it if you’re writing a shell script that people (read: you and anyone who isn’t you) will have to use. There are some excellent libraries available for almost any language, but if you’re passing more than a handful of arguments (positional or otherwise) to a shell script, please consider rewriting your tool in a more robust language. ↩︎

  2. Full disclosure: I’ve contributed a few patches to Dante. If you really, really care about this sort of stuff you should read Jesse Storimer’s Working With Unix Processes. Daemons, Daemon-kit, and Dante are sort of the go-to examples in Ruby land. Python has python-daemon, YapDi and a well-written overview of standard daemon behaviors. ↩︎

  3. Pop quiz: what init system does your production stack use? SysV Init (Pretty much every­thing before 2006, current Debian stable builds)? Upstart (Ubuntu since like 2006, RHEL/CentOS 6)? SystemD (Fedora, future Debian stable builds, and probably RHEL/CentOS 7 whenever they come out)? InitNG or RunIt (Maybe if you’re using Gentoo, Arch, or Linux from Scratch)? Accepted best practice is to target plain old vanilla SysV if you need porta­bility, since Upstart and SystemD both provide compat­i­bility wrappers (albeit to different degrees). ↩︎