UNIX and long-running processes

2021/08/23

This is post no. 27 for Kev Quirk's #100DaysToOffload challenge. The point is to write many things, not to write good ones. Please adjust quality expectations accordingly :)

The traditional way you use UNIX (as a programming language) is by launching tiny processes that cooperate well. Piping is cool enough, but the real power comes with processes being able to execute other processes. Think of xargs, for example.

Taken to the extreme, you get Bernstein chaining. The idea is that a process can launch into another binary while keeping the actual process (PID, file descriptors, etc.), so you can produce interesting combinations that would otherwise need custom code in a single process. For example, you can open a TCP port, start listening, then launch into a program that drops privileges; in turn, this can launch into the actual server process that accepts connections and launches UNIX commands on the resulting two-way network connection pipes. s6's execline makes heavy use of this, essentially replacing the entirety of a shell interpreter with nicely crafted, very long command lines; another example is their network suite, which you can use to construct a TCP server without ever writing C code doing the "networking" part.
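
A single link of such a chain could look roughly like the sketch below. It is only an illustration under assumptions: the name `chain_into` and the convention of handing the resource over as descriptor 3 are made up for this example, and are not taken from execline or the s6 tools.

```c
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: park `fd` on descriptor 3, then exec the next
 * program in the chain. The PID stays the same and descriptor 3 stays
 * open across the exec. Returns -1 on failure; on success it never
 * returns, because the process image has been replaced. */
int chain_into(int fd, char *const argv[]) {
  if (fd != 3) {
    if (dup2(fd, 3) < 0) return -1;
    close(fd);
  }

  /* Clear close-on-exec, in case the caller had set it on the fd. */
  int flags = fcntl(3, F_GETFD);
  if (flags < 0 || fcntl(3, F_SETFD, flags & ~FD_CLOEXEC) < 0) return -1;

  execvp(argv[0], argv);
  return -1; /* only reached if execvp failed */
}
```

A tool built around this would do its one job (bind a socket, drop privileges, ...) and then call something like `chain_into(fd, argv + 1)` on the rest of its own command line.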

This makes it easy to write CGI-like software: a long-running server launches short-lived processes that serve requests. However, sometimes, you have the opposite problem: what if you have a long-lived process that you'd like to interact with from the command line?

For example, wpa_supplicant is a long-lived wifi daemon, managing connections. While it's running, you can use wpa_cli to communicate with it... they probably use a UNIX socket with some custom protocol in the background.

Another one is a typical web application. The original solution for dynamic web pages was CGI: essentially, for each incoming request, the web server launched an executable (often a Perl script) which then read headers from environment variables and wrote the generated page to stdout. You could use any programming language, it was easy to use... but then also, starting each process was kinda slow, and most definitely didn't work well for things like Java. So we invented protocols (FastCGI for example) that talked to long-running processes and let them process multiple requests. Of course, we lost a lot of the simplicity in the process.
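
To show how little a CGI program needs, here is a minimal sketch in C. `REQUEST_METHOD` and `QUERY_STRING` are standard CGI variables; the function name `cgi_respond` and the `FILE *` parameter are choices made for this example only (a real CGI program would just write to stdout).

```c
#include <stdio.h>
#include <stdlib.h>

/* Write a minimal CGI response to `out`. Returns 0 on success, -1 if
 * writing failed. */
int cgi_respond(FILE *out) {
  /* The web server hands over request data as environment variables. */
  const char *method = getenv("REQUEST_METHOD");
  const char *query = getenv("QUERY_STRING");

  /* Header block, an empty line, then the generated page. */
  fputs("Content-Type: text/plain\n\n", out);
  fprintf(out, "method=%s\n", method ? method : "(unset)");
  fprintf(out, "query=%s\n", query ? query : "(unset)");
  return ferror(out) ? -1 : 0;
}
```

Wrapping this in a `main` that calls `cgi_respond(stdout)` is all it takes; the price is one process launch per request.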

... what is a process, anyway?

It's just a bunch of threads sharing the same memory mappings. You can create new ones by forking, and replace all the memory mappings with new ones by calling the exec system call.
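
In code, the fork-then-exec dance is short. A minimal sketch, where `spawn_and_wait` is a name invented for this example:

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fork a child, replace its image with the program in argv[0], and
 * wait for it. Returns the child's exit status, or -1 on error or if
 * the child did not exit normally. */
int spawn_and_wait(char *const argv[]) {
  pid_t pid = fork();
  if (pid < 0) return -1;

  if (pid == 0) {
    /* Child: same memory contents as the parent, until exec swaps
     * them for the new program. */
    execvp(argv[0], argv);
    _exit(127); /* exec failed */
  }

  /* Parent: wait for the child to finish. */
  int status;
  if (waitpid(pid, &status, 0) < 0) return -1;
  return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```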

While doing this, we get some nice interfaces to help communicate with the new processes. We keep passing along file descriptors (... stdin, stdout and stderr among them), we can give the new process environment variables, and we also have a command line: a list of strings. Much of the niceness of UNIX is a consequence of the simplicity and universality of this interface. Everything is a file, files have descriptors, and you can launch executable files with parameters.
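
All three channels fit in one small function. The sketch below is an assumption-laden example, not a recommended API: `run_with` and its parameter list are invented here to show a descriptor, an environment variable and an argument list being handed to a new process.

```c
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Run argv with one extra environment variable set and with its stdout
 * pointing at `out_fd`. Returns the exit status, or -1 on error. */
int run_with(int out_fd, const char *name, const char *value,
             char *const argv[]) {
  pid_t pid = fork();
  if (pid < 0) return -1;

  if (pid == 0) {
    if (setenv(name, value, 1) != 0) _exit(126);          /* environment */
    if (dup2(out_fd, STDOUT_FILENO) < 0) _exit(126);      /* descriptors */
    execvp(argv[0], argv);                                /* command line */
    _exit(127);
  }

  int status;
  if (waitpid(pid, &status, 0) < 0) return -1;
  return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```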

And yet, once a process is running, basically the only way to initiate talking to it is sending it signals. Which is... not an overly flexible system.

What if we had something that's a bit fancier than this?

Calling into existing processes

Now, imagine the following:


~$ wpa_supplicant &
[1] /run/wpa_supplicant_3
~$

Note that instead of a job number or pid, we get an actual path. This is UNIX, everything should be a file. Including processes.

Now... we could just


~$ /run/wpa_supplicant_3 list_networks
Here is the list of all networks:

some_SSID
some_other_SSID
~$ /run/wpa_supplicant_3 connect some_SSID
Connecting to some_SSID... done.
~$

Of course, this is nothing new, in terms of functionality: a program can open a UNIX socket somewhere, and then there can be another program, opening the socket, and talking to it, using some protocol. Except... what if, to implement this, you could just write something like...


int main(int argc, char** argv) {
  do_startup_stuff(argc, argv);

  message_t *m;

  while (next_message(&m)) {
    switch (m->argv[1]) {
     case "list_networks":
       dprintf(m->fnv[1], "Here is the list of all networks:\n\n");

       for (network_t net : networks) {
         dprintf(m->fnv[1], "%s\n", net->name);
       }
       break;
       // ...
    }
  }
}

Yes, this is a message pump. For a process. (If this sounds familiar from Windows, it's not a coincidence.) Except... it has all UNIX-y goodness: "messages" are essentially the same as launching a process, with its own argument list, even file descriptors. From the outside, it looks like you launched a program. In fact, you launched into an existing program, in a way.

(Yes, this is not entirely C, but... real C is uglier.)


struct message_t {
  int argc;
  char **argv;

  int fnc;
  int *fnv; // File descriptors present when calling into the process

  int uid;
  int gid;
  char *current_dir;
};
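
The `fnv` field assumes descriptors can travel from the caller into the running process. Without a new OS service, the closest existing mechanism I know of is `SCM_RIGHTS` on UNIX domain sockets. A rough sketch, with `send_fd` and `recv_fd` as names invented for this example:

```c
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>

/* Properly aligned space for one descriptor's worth of control data. */
union fd_ctl {
  char buf[CMSG_SPACE(sizeof(int))];
  struct cmsghdr align;
};

/* Send one descriptor over a UNIX domain socket. Returns 0 or -1. */
int send_fd(int sock, int fd) {
  char byte = 0; /* send one ordinary byte along with the control data */
  struct iovec iov = {.iov_base = &byte, .iov_len = 1};
  union fd_ctl ctl;
  struct msghdr msg;

  memset(&ctl, 0, sizeof ctl);
  memset(&msg, 0, sizeof msg);
  msg.msg_iov = &iov;
  msg.msg_iovlen = 1;
  msg.msg_control = ctl.buf;
  msg.msg_controllen = sizeof ctl.buf;

  struct cmsghdr *c = CMSG_FIRSTHDR(&msg);
  c->cmsg_level = SOL_SOCKET;
  c->cmsg_type = SCM_RIGHTS;
  c->cmsg_len = CMSG_LEN(sizeof(int));
  memcpy(CMSG_DATA(c), &fd, sizeof(int));

  return sendmsg(sock, &msg, 0) == 1 ? 0 : -1;
}

/* Receive one descriptor. Returns the new descriptor, or -1. */
int recv_fd(int sock) {
  char byte;
  struct iovec iov = {.iov_base = &byte, .iov_len = 1};
  union fd_ctl ctl;
  struct msghdr msg;

  memset(&ctl, 0, sizeof ctl);
  memset(&msg, 0, sizeof msg);
  msg.msg_iov = &iov;
  msg.msg_iovlen = 1;
  msg.msg_control = ctl.buf;
  msg.msg_controllen = sizeof ctl.buf;

  if (recvmsg(sock, &msg, 0) != 1) return -1;

  struct cmsghdr *c = CMSG_FIRSTHDR(&msg);
  if (c == NULL || c->cmsg_level != SOL_SOCKET ||
      c->cmsg_type != SCM_RIGHTS || c->cmsg_len != CMSG_LEN(sizeof(int)))
    return -1;

  int fd;
  memcpy(&fd, CMSG_DATA(c), sizeof(int));
  return fd;
}
```

The receiver gets its own descriptor number that refers to the same open file, which is about what `fnv` would need.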

With this API, you can stay single-threaded if you want. You can fork for each request if you want. You could even do some permissions checking, since messages include uids and gids.

Of course, there is nothing here that is overly complicated to implement using UNIX domain sockets and some libraries. All you need is a command-line program that packs its arguments into one binary blob, sends it over to a listening process... and a scheme to associate UNIX domain sockets with running processes. Some parser on the other end. Unless, of course, your programming language is platform-independent enough not to support UNIX sockets. Or you couldn't really choose between the 8 competing and mutually incompatible implementations. Or you realized that you could just as well use dbus, and three days later you still haven't figured out all the data types and namespaces.
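
The "binary blob" part really can be small. One possible wire format, assumed here purely for illustration, is the arguments laid end to end as NUL-terminated strings; `pack_args` and `unpack_args` are hypothetical names.

```c
#include <stddef.h>
#include <string.h>

/* Pack argv into buf as consecutive NUL-terminated strings.
 * Returns the number of bytes used, or -1 if buf is too small. */
long pack_args(int argc, char *const argv[], char *buf, size_t cap) {
  size_t used = 0;
  for (int i = 0; i < argc; i++) {
    size_t n = strlen(argv[i]) + 1; /* include the terminator */
    if (n > cap - used) return -1;
    memcpy(buf + used, argv[i], n);
    used += n;
  }
  return (long)used;
}

/* Split a packed blob into pointers into buf. Returns argc, or -1 if
 * the blob is malformed or holds more than `max` arguments. */
int unpack_args(char *buf, size_t len, char **argv, int max) {
  /* A well-formed blob ends on a terminator, so strlen stays in bounds. */
  if (len > 0 && buf[len - 1] != '\0') return -1;

  int argc = 0;
  size_t pos = 0;
  while (pos < len) {
    if (argc == max) return -1;
    argv[argc++] = buf + pos;
    pos += strlen(buf + pos) + 1;
  }
  return argc;
}
```

The client would write the packed bytes to the socket and the daemon would run `unpack_args` on whatever it read; the uid, gid, current directory and descriptors would still need to travel some other way.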

Meanwhile, Windows went the Windows way of packing everything ever into a couple of fixed-size integers and ugly pointers (see: MSG). And yet, it worked out fairly OK for them. It's a lot better than socket-opening and signals.


typedef struct tagMSG {
  HWND   hwnd;
  UINT   message;
  WPARAM wParam;
  LPARAM lParam;
  DWORD  time;
  POINT  pt;
  DWORD  lPrivate;
} MSG, *PMSG, *NPMSG, *LPMSG;

Either way: having a standard way of communicating with long-running processes would be somewhat neat. If it didn't involve parsing complex protocols, that'd be even more neat.

(... actually though... writing such a library, even with an ugly hack instead of a true OS service under it, might be still fun?)

... comments welcome, either in email or on the (eventual) Mastodon post on Fosstodon.