
Port numbers and SRV records

2021/08/26

... yet another post about TCP ports.

This is post no. 28 for Kev Quirk's #100DaysToOffload challenge. The point is to write many things, not to write good ones. Please adjust quality expectations accordingly :)

Remember the grand conclusion about how TCP ports and their numbers are stupid? (If you don't: here is the article.) That it's just weird how everything else has a decent hierarchical namespace, and yet, these somehow... don't?

Well, as it turns out, they do, except we somehow ignored the entire concept because Reasons. (... maybe the Web took over and we just started ignoring everything that's not port 80 or 443?)

... because, of course you do know what ports 80 and 443 are. (If you don't, I'm somewhat surprised that you're reading part 2 of a rant about Internet history and TCP ports; if you're still insistent on going on with this: here is a hint.)

And here comes my point: in an ideal world, you really shouldn't know what ports 80 and 443 are.

DNS

Once upon a time, there were only IP addresses. Although IPv4 addresses are just four bytes, people like names better... so domain names and DNS were invented, to avoid having to remember the numbers.

Which is the simplest-to-understand reason why DNS exists. Regardless of why it was created though... there is a really nice side effect: you can swap out the server IP under a website, without changing the domain name everyone remembers / has bookmarked / hardcoded / etc. Which is great, given how IP addresses kind of depend on where your server happens to be, ISP-wise.

Or take virtualhosts. Same IP, different domain name... well it's kind of a hack, but it does happen to work.

Or having DNS point to servers that are the closest geographically.

But before DNS was a thing, there was...

/etc/hosts

It worked equally well for not having to remember IP addresses. Of course, it fails when it comes to most of the other benefits of DNS: it's still a rigid association, set in stone whenever the sysadmin decides to pick up an IP and give it a name.

... how about ports then?

Fun fact: there is an /etc/hosts for port numbers! It's called /etc/services. It does pretty much the same thing: it translates service names to port numbers.

It's still larger though than /etc/hosts on relatively modern systems. Sadly. We'll see why this is sad later.
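For the curious, here is a minimal sketch of what that lookup amounts to, in Python. The sample text is a made-up excerpt in the /etc/services format (not copied from any particular system), and parse_services is a hypothetical helper; on a real machine, socket.getservbyname("http", "tcp") from the standard library asks the system's own database instead.

```python
# Illustrative excerpt in /etc/services format: name, port/protocol, aliases.
SAMPLE = """\
# service-name  port/protocol  [aliases]
ssh             22/tcp
http            80/tcp          www
https           443/tcp
xmpp-client     5222/tcp        jabber-client
"""


def parse_services(text):
    """Map (service name or alias, protocol) to a port number."""
    table = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        name, port_proto, *aliases = line.split()
        port, proto = port_proto.split("/")
        for service in (name, *aliases):
            table[(service, proto)] = int(port)
    return table


table = parse_services(SAMPLE)
print(table[("http", "tcp")])   # 80
print(table[("www", "tcp")])    # 80, via the alias column
```

Note that the mapping is purely local and purely static: the name says nothing about which host runs the service, only which number it is expected to sit on everywhere.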

Meanwhile... just as DNS took over host - IP matching from /etc/hosts, there is a DNS equivalent for service - port matching, too! It's called SRV records. Here are some examples from the Wikipedia page:


# _service._proto.name.  TTL   class SRV priority weight port target.
_sip._tcp.example.com.   86400 IN    SRV 10       60     5060 bigbox.example.com.
_sip._tcp.example.com.   86400 IN    SRV 10       20     5060 smallbox1.example.com.
_sip._tcp.example.com.   86400 IN    SRV 10       20     5060 smallbox2.example.com.
_sip._tcp.example.com.   86400 IN    SRV 20       0      5060 backupbox.example.com.
        

Basically, you're mapping a service / protocol to a host and a port.
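As a rough sketch of how a client might use those records (in Python, with the four example lines pasted in as a string rather than fetched over DNS): lower priority values get tried first, and within one priority, higher weights get picked more often. parse_srv and connection_order are hypothetical helper names, and the weighted pick is a simplification of the selection procedure described in RFC 2782, not a faithful implementation of it.

```python
import random
from dataclasses import dataclass

ZONE = """\
_sip._tcp.example.com.   86400 IN    SRV 10       60     5060 bigbox.example.com.
_sip._tcp.example.com.   86400 IN    SRV 10       20     5060 smallbox1.example.com.
_sip._tcp.example.com.   86400 IN    SRV 10       20     5060 smallbox2.example.com.
_sip._tcp.example.com.   86400 IN    SRV 20       0      5060 backupbox.example.com.
"""


@dataclass(frozen=True)
class Srv:
    priority: int
    weight: int
    port: int
    target: str


def parse_srv(text):
    """Parse zone-file style SRV lines; anything else is skipped."""
    records = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 8 or fields[3] != "SRV":
            continue
        priority, weight, port = (int(f) for f in fields[4:7])
        records.append(Srv(priority, weight, port, fields[7].rstrip(".")))
    return records


def connection_order(records, rng=random):
    """Order targets: lowest priority first, weighted random within a priority."""
    ordered = []
    for priority in sorted({r.priority for r in records}):
        group = [r for r in records if r.priority == priority]
        while group:
            weights = [r.weight for r in group]
            if sum(weights) == 0:
                pick = rng.choice(group)  # all weights zero: plain random pick
            else:
                pick = rng.choices(group, weights=weights)[0]
            group.remove(pick)
            ordered.append(pick)
    return ordered


for record in connection_order(parse_srv(ZONE)):
    print(f"{record.target}:{record.port} (priority {record.priority})")
```

Whatever the random draw, backupbox comes out last, since it is alone in the priority 20 group. And the client never had to know the number 5060 in advance; it only asked for _sip._tcp.example.com.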

This, by the way, solves most things that are really bad about TCP ports! You want to run multiple instances of the same service on the same machine? Just put them on different ports and point different domains at them! Want to have an actual name for your protocol? Sure, go ahead!

Want your web server to serve multiple sites, all of them over HTTPS? How about just creating an SRV record for each of them, and having the web server listen on a separate port for each?

This is the way virtualhosts work, right? ... right???

(no.)

The way virtualhosts really work involves the client sending the server name inside the HTTP request itself (the Host header), over port 80. Because port 80 is HTTP and HTTP is port 80 and that's how things work.
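To make that concrete: the name travels in the Host header of the request, and the server picks a site only after it has read that far. Below is a minimal sketch in Python; pick_site, the SITES table and its directory paths are all made up for illustration, and real servers handle many more cases (IPv6 literals, malformed requests and so on).

```python
# Hypothetical virtualhost table: host name -> document root.
SITES = {
    "blog.example.com": "/srv/blog",
    "shop.example.com": "/srv/shop",
}


def pick_site(raw_request, sites, default="/srv/default"):
    """Choose a document root based on the Host header of a raw HTTP request."""
    head = raw_request.split(b"\r\n\r\n", 1)[0].decode("latin-1")
    for line in head.split("\r\n")[1:]:  # skip the request line
        name, _, value = line.partition(":")
        if name.strip().lower() == "host":
            host = value.strip().lower().split(":")[0]  # drop an optional port
            return sites.get(host, default)
    return default


request = b"GET / HTTP/1.1\r\nHost: shop.example.com\r\n\r\n"
print(pick_site(request, SITES))  # /srv/shop
```

Same IP, same port, and the only thing telling the two sites apart is one line of text inside the request.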

Ohh also, browsers, in their infinite wisdom, will not do SRV record lookups. People tried to get patches in. Not a lot of results though.

Meanwhile... HTTP is only mildly ugly without SRV records. HTTPS, on the other hand... well, you see, in order to connect to a server securely, it needs to send you the right certificate, which depends on which domain you really want to request. Of course, because HTTPS is port 443 and port 443 is HTTPS and this is how things work, all the server sees is a connection on its serving IP; it has no idea which virtualhost the client wants until the client sends its HTTP request. Which is supposed to be encrypted. Using the cert that we can't send out until... yeah. It's awesome.

So the "solution" to this is to break the neat layering of protocols and stuff domain names into TLS (the encryption layer) itself; this is the extension known as SNI, Server Name Indication. Which is sad and stupid and would have been avoidable. If only we had SRV records.
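You can actually watch the domain name being stuffed into TLS without touching the network. The sketch below (Python, standard ssl module only) starts a client handshake against in-memory buffers and checks whether the requested name shows up, unencrypted, in the very first message the client would send. client_hello_for is a hypothetical helper and blog.example.com is just a placeholder name.

```python
import ssl


def client_hello_for(hostname):
    """Return the first bytes a TLS client would send when asking for hostname."""
    context = ssl.create_default_context()
    incoming = ssl.MemoryBIO()   # bytes from the server (none, in this sketch)
    outgoing = ssl.MemoryBIO()   # bytes the client wants to send
    tls = context.wrap_bio(incoming, outgoing, server_hostname=hostname)
    try:
        tls.do_handshake()
    except ssl.SSLWantReadError:
        # Expected: the client has written its ClientHello and now waits
        # for a server reply that never comes.
        pass
    return outgoing.read()


hello = client_hello_for("blog.example.com")
print(b"blog.example.com" in hello)
```

The name has to be readable at this point, because the server needs it to choose a certificate before any encryption has been set up.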

On the plus side... XMPP does use SRV records!!! So it's not all sadness all the way.

... so...

... we were late. By the time SRV records happened, we had a bunch of protocols and services already, all sitting on fixed ports... similarly to Internet (... ARPANET?) hosts, sitting on fixed IPs. Keeping /etc/hosts files in sync with everything stopped working faster though; we had to switch over to DNS, with neat side effects. We could keep up with /etc/services... so we lost out on all the DNS-based benefits there.

And, of course, the Web went ahead with its hacky solution instead of the right one (as it did many, many times)... so this is what we happen to have.

... comments welcome, either in email or on the (eventual) Mastodon post on Fosstodon.