
Sockets: part 1

2021/04/04

... in which we start discussing sockets... without really discussing sockets.

tl;dr: sockets are... just... bad.

Hypothetical horror story

... as you might remember from your intro Computer Usage classes, computer systems store data in files. You might be more familiar with a directory listing like this:

Now, unlike those, we're working with highly advanced UNIX machines here. "File names", like "3D Reversi" in the above listing, are clearly important for novice users, but we no longer need them once it comes to UNIX; we're aiming for higher performance, after all. Instead, what we have is:


instructor@unix-main $ ls -N
total 20477

-rw-r-xr-- 1 root root 16086 Apr  4 22:08 1
-rw-r-xr-- 1 root root 32030 Apr  4 22:08 2
-rw-r-xr-- 1 root root 73892 Apr  4 22:08 3
-rw-r-xr-- 1 root root 13M   Apr  4 22:08 4
-rw-r-xr-- 1 root root 3.3G  Apr  4 22:08 7
-rw-r-xr-- 1 root root 37828 Apr  4 22:08 12
(...)
-rw-r-xr-x 1 root       root  16086 Apr  4 22:08 1023
-rw-rw-rw- 1 users      users 38212 Apr  4 22:08 1024
-rw-rw-rw- 1 instructor users 8.3G  Apr  4 22:08 1037
-rw-rw-rw- 1 jbig       users 1.3G  Apr  4 22:08 1038
-rw-rw-rw- 1 accnting   users 372M  Apr  4 22:08 1082
-rw-rw-rw- 1 dev        users 37292 Apr  4 22:08 1028
        

As you can see, files on a UNIX system are numbered from 1 to 65535. Note that we're printing them in their numeric form; normally, for example, we'd see "libc.so" instead of file number 4, for the sake of people who do not remember the obvious fact that "4" is the C library.

Also pay attention to the columns on the left! Unlike personal computers, UNIX is a multi-user system, so one must have some control over who can write and modify which files. We accomplish this with a sophisticated, two-layer security system: files with names 1023 or below are writable only by the superuser, conveniently preventing users from messing with important files like 4, 33, 111 or, of course, 374. On the other hand, files with names from 1024 upwards can be written to by users, too! ... any user, that is. I don't think it's particularly likely that anyone would want to prevent other users from reading or writing to their files, but in that case, you might want to consider the excellent encryption library called "587".
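
For the skeptics: the entire permission model fits in a few lines. What follows is only a minimal sketch of the rule as described above, in Python; the names `FIRST_USER_FILE`, `MAX_FILE` and `can_write` are made up for illustration and do not correspond to any real API.

```python
# Hypothetical model of the "two-layer security system" described above.
# All names here are illustrative assumptions, not a real interface.
FIRST_USER_FILE = 1024   # files below this number belong to the superuser
MAX_FILE = 65535         # highest file number on the system


def can_write(file_number: int, user: str) -> bool:
    """Return True if `user` may write to the numbered file."""
    if not 1 <= file_number <= MAX_FILE:
        raise ValueError(f"no such file: {file_number}")
    if file_number < FIRST_USER_FILE:
        return user == "root"   # layer one: superuser only
    return True                 # layer two: any user at all


print(can_write(4, "jbig"))      # False: "4" is off limits to mere users
print(can_write(1037, "jbig"))   # True: even though 1037 is the instructor's
```

Note that the second layer never looks at the owner column at all, which is rather the point.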

Frequently Asked Questions

How do I give my file a name? Of course, for the best results, you register it with the Assigned File Numbers Authority so that others don't try to read e.g. your cat picture named "7772" as a phone book. (Of course, this is just a hypothetical example; cat pictures were already assigned the range 8700-13399.)

What if my file / program is relevant to a couple of systems only? You indeed might not want to register it globally; the solution is to use the file mapper! Just pick a number for your file (no, this is a different number), and add it to file number 111 just like this:


$ cat 111
  program vers   file
   100000    2    111  portmapper
   100000    2    111  portmapper
   100003    2   2049  nfs
   100003    3   2049  nfs
   100003    4   2049  nfs
   100003    2   2049  nfs
   100003    3   2049  nfs
   100003    4   2049  nfs
   100024    1  32770  status
   100021    1  32770  nlockmgr
   100021    3  32770  nlockmgr
   100021    4  32770  nlockmgr
   100024    1  32769  status
   100021    1  32769  nlockmgr
   100021    3  32769  nlockmgr
   100021    4  32769  nlockmgr
   100005    1    644  mountd
   100005    1    645  mountd
   100005    2    644  mountd
   100005    2    645  mountd
   100005    3    644  mountd
   100005    3    645  mountd
        

On the left side is the number you made up; on the right, an actual file number. Of course, the "names" are for informational purposes only. This way you can even map multiple versions of a similar file to actual file system entities! Isn't this impressively flexible?
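
To make the lookup concrete, here is a rough sketch in Python of how a client might read such a table. It is an illustration only: `parse_file_map` and `LISTING` are hypothetical names, and the sample rows are a handful copied from the listing above rather than the full contents of file 111.

```python
from collections import defaultdict

# A few rows copied from the listing of file 111 above (not the full table).
LISTING = """\
  program vers   file
   100000    2    111  portmapper
   100003    3   2049  nfs
   100005    1    644  mountd
   100005    1    645  mountd
"""


def parse_file_map(text: str) -> dict:
    """Map (program, version) to the list of file numbers registered for it."""
    mapping = defaultdict(list)
    for line in text.splitlines():
        fields = line.split()
        # Skip the header row and anything else that is not a data row.
        if len(fields) != 4 or not fields[0].isdigit():
            continue
        program, version, file_number = (int(f) for f in fields[:3])
        # The trailing name is informational only, so it is ignored here.
        if file_number not in mapping[(program, version)]:
            mapping[(program, version)].append(file_number)
    return dict(mapping)


file_map = parse_file_map(LISTING)
print(file_map[(100005, 1)])   # [644, 645]
```

The sketch keeps a list per entry because, as the listing shows, one made-up number and version can point at more than one actual file.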

... but I have many computers and these might change dynamically...? Well, okay. Let me tell you about the Domain Name System (DNS), which also covers files and their names. You can use the so-called FILE records to assign names to certain files on some hosts, in addition to looking up their IP addresses... so that a lookup can tell you that e.g. company-logo.example.org refers to file 428 on the machine with IP address 122.211.48.23.
Please be aware, though, that giving names to files this way is a fairly obscure practice that not a lot of software respects; not putting your company logo into file 534 is highly unusual anyway and should probably only be done if, for some reason, you can't put it there. (Please also note that updating company logos should probably only be done by root.)
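
In case the idea of a FILE record is still fuzzy, the following toy resolver shows roughly what such a lookup returns. It is a sketch under stated assumptions: `FileRecord`, `FILE_RECORDS` and `resolve_file` are hypothetical names, the "zone" is a plain in-memory dictionary, and its single entry is the example given above.

```python
from typing import NamedTuple


class FileRecord(NamedTuple):
    address: str       # IP address of the host holding the file
    file_number: int   # numbered file on that host


# Hypothetical zone data; the only entry is the example from the text.
FILE_RECORDS = {
    "company-logo.example.org": FileRecord("122.211.48.23", 428),
}


def resolve_file(name: str) -> FileRecord:
    """Look up a FILE record; raise LookupError if the name is unknown."""
    key = name.rstrip(".").lower()   # names are case-insensitive
    try:
        return FILE_RECORDS[key]
    except KeyError:
        raise LookupError(f"no FILE record for {name!r}") from None


record = resolve_file("company-logo.example.org")
print(f"{record.address} file {record.file_number}")   # 122.211.48.23 file 428
```

Software that does not respect FILE records would, of course, skip all this and go straight for file 534.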

I have multiple company logos. Now, we've come to really advanced material; you should already be familiar with all the systems and methods described above before trying this one. Namely: namespaces might allow you to have an entirely different set of files, with the same names, as if you had a different computer! So... while your main namespace could have one company logo at 534, you could create a different namespace, where the same file 534 could serve as another, entirely different logo. Moreover... you might even gain ways of preventing users from overwriting your files, even though they are higher than 1024... just by separating them into a different namespace. Aren't the possibilities in UNIX limitless?

... 40 years later...

Modern technologies offer immense possibilities! Remember how early UNIX machines only had numbers for file names? Wasn't that ridiculous?

It's really good that we built something much nicer on top.

I'll save most of the details for Part 2; let me summarize how a modern UNIX system works, though. Well, some data is still stored as numbered files, but... most of it is in a ZIP file called 443. This file employs sophisticated encryption methods to make sure that only authorized users can access protected parts!

Of course, since it's file 443, it's still only writeable by root. However, employing some clever mechanisms we're not going to detail here, your sysadmin can set it up so that you could have your own encrypted ZIP file, with an entire, named directory tree in it!

Naturally, this two-layered system is a little bit more complex... but isn't it worth it for your own actual files that you could just create whenever you wish?

(... to be continued. Possibly with weird solution ideas. Or more of these slightly deranged hypothetical scenarios.)

... comments welcome, too, either in email or on the (eventual) Mastodon post on Fosstodon.

This is post no. 1 for Kev Quirk's #100DaysToOffload challenge.