If Sockets had Paths
After all these posts about how TCP ports and their numbers are stupid... here is what it could have looked like instead.
Imagine if we just took the example of UNIX domain sockets, and extended them to work over The Internet.
As in: instead of some.host.name:1234, you would connect to some.host.name:/some/socket.
How does TCP work currently?
It's all happening over IP. (It's TCP/IP for a reason, after all.) Here is what a TCP header looks like:
All this is sitting in an IP packet, with the following header:
(... you can find a nice combined depiction here.)
As you can see, to address a port, you need a 4-byte IP address (... for IPv4 at least; IPv6 addresses are 16 bytes) and a 2-byte port number. Which... has all kinds of problems:
- port numbers as addresses are not great
- also, they aren't really unique enough, so you also need the IP address to be bundled with them... which really doesn't play well with IP mobility.
We can't really just replace port numbers with arbitrary-length strings though: stuffing an entire string into each packet would be really wasteful.
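To make the IP mobility problem above a bit more concrete, here is a minimal sketch of how a TCP-like stack finds the connection a packet belongs to: it looks up the full (source IP, source port, destination IP, destination port) tuple. The table layout, the function names and the addresses are made up for illustration; this is not how any particular kernel stores its connections.

```python
# Sketch only: names and data layout are illustrative assumptions.

connections = {}  # (src_ip, src_port, dst_ip, dst_port) -> connection state


def open_connection(src_ip, src_port, dst_ip, dst_port):
    """Register a connection under its full 4-tuple."""
    connections[(src_ip, src_port, dst_ip, dst_port)] = {"bytes_received": 0}


def deliver(src_ip, src_port, dst_ip, dst_port, payload):
    """Return True if the packet matched a known connection."""
    state = connections.get((src_ip, src_port, dst_ip, dst_port))
    if state is None:
        return False  # no such connection; a real stack would reject the packet
    state["bytes_received"] += len(payload)
    return True


open_connection("192.0.2.10", 50000, "198.51.100.1", 443)

before_move = deliver("192.0.2.10", 50000, "198.51.100.1", 443, b"hello")

# Same client, same ports, but it switched networks and got a new address:
after_move = deliver("203.0.113.7", 50000, "198.51.100.1", 443, b"hello")
```

Here `before_move` is `True` and `after_move` is `False`: since the IP address is part of the lookup key, a client that changes address is a stranger as far as the connection table is concerned.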
The alternative
However... no one said that establishing a connection should be as cheap as sending packets through it!
Namely... what if there were two kinds of packets:
- one that establishes connections, with the full string, agreeing on a session ID (e.g. a 4-byte one, to avoid collisions)
- ... and then each subsequent packet needs only a session ID to be routed to the right process!
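The two packet kinds above could be sketched roughly like this. Everything here is hypothetical: the wire format (a 1-byte type, then either a length-prefixed path or a 4-byte session ID), the `Host` class, and the choice of letting the receiving side pick the session ID are all assumptions made for the sake of the example, not an existing protocol.

```python
import secrets
import struct

CONNECT, DATA = 1, 2  # hypothetical packet types


def make_connect(path: str) -> bytes:
    """Connect packet: type (1 byte), path length (2 bytes), path."""
    name = path.encode("utf-8")
    return struct.pack("!BH", CONNECT, len(name)) + name


def make_data(session_id: int, payload: bytes) -> bytes:
    """Data packet: type (1 byte), session ID (4 bytes), payload."""
    return struct.pack("!BI", DATA, session_id) + payload


class Host:
    def __init__(self):
        self.listeners = {}  # path -> received payloads (stands in for a process)
        self.sessions = {}   # session ID -> path

    def listen(self, path: str) -> None:
        self.listeners[path] = []

    def handle(self, packet: bytes):
        """Return the session ID the packet belongs to, or None if rejected."""
        kind = packet[0]
        if kind == CONNECT:
            (length,) = struct.unpack_from("!H", packet, 1)
            path = packet[3:3 + length].decode("utf-8")
            if path not in self.listeners:
                return None  # nobody is listening on this path
            session_id = secrets.randbits(32)
            while session_id in self.sessions:  # re-roll on collision
                session_id = secrets.randbits(32)
            self.sessions[session_id] = path
            return session_id  # would be sent back to the client in a reply
        if kind == DATA:
            (session_id,) = struct.unpack_from("!I", packet, 1)
            path = self.sessions.get(session_id)
            if path is None:
                return None  # unknown session
            self.listeners[path].append(packet[5:])
            return session_id
        return None


host = Host()
host.listen("/some/socket")
sid = host.handle(make_connect("/some/socket"))
host.handle(make_data(sid, b"hello"))

connect_size = len(make_connect("/some/socket"))  # 3 bytes + 12 bytes of path
data_overhead = len(make_data(sid, b""))          # 5 bytes, whatever the path was
```

The point is the last two lines: the path costs its full length exactly once, in the connect packet, while every data packet carries a fixed 5 bytes of addressing in this made-up format, no matter how long the socket's name is.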
That way you could also have connections that don't necessarily break if one side changes IP addresses: if you get a packet from a different IP, you just update the address of your counterpart!
... at least you do so after ensuring that you're still talking to the same endpoint. Which probably needs some cryptography.
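As a rough illustration of "update the address, but only after checking who is talking", here is a sketch that authenticates each packet with an HMAC over a key both sides are assumed to share already. The `Session` class, the key and the addresses are invented for this example; a real protocol would also need a key exchange and replay protection, both of which are skipped here.

```python
import hashlib
import hmac


class Session:
    def __init__(self, key: bytes, peer_addr):
        self.key = key              # assumed to come from an earlier handshake
        self.peer_addr = peer_addr  # (ip, port) we currently send replies to

    def tag(self, payload: bytes) -> bytes:
        """Authentication tag the sender attaches to a payload."""
        return hmac.new(self.key, payload, hashlib.sha256).digest()

    def receive(self, from_addr, payload: bytes, tag: bytes) -> bool:
        """Accept the packet only if the tag verifies; follow address changes."""
        if not hmac.compare_digest(tag, self.tag(payload)):
            return False  # not our counterpart: keep the old address
        if from_addr != self.peer_addr:
            self.peer_addr = from_addr  # same endpoint, new address
        return True


session = Session(b"shared secret from the handshake", ("192.0.2.10", 50000))

message = b"still me"
moved = session.receive(("203.0.113.7", 50000), message, session.tag(message))
forged = session.receive(("198.51.100.99", 1234), b"hijack", b"\x00" * 32)
```

After this, `moved` is `True` and `session.peer_addr` points at the new address, while the forged packet is rejected and changes nothing.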
... so when will we get here?
Probably never, since replacing already-working protocols with new (even possibly better) ones is pretty much impossible.
Or... actually...
... have you ever looked at QUIC?
The protocol that does HTTP over UDP?
Because it's supposed to be lower latency and resistant to IP changes?
(Here are some more details about what's going on with it.)
Put together with HTTP/3 (... which is HTTP over QUIC, basically), it's an alternative stack that:
- ... uses hierarchically organized strings (URLs) to establish connections, which...
- ... then get their own connection IDs
- these are resistant to IP changes
- and... sure, it's HTTP, but if you pile websockets on this, you can make them bidirectional.
So, by the looks of it, we managed to invent named-by-strings "sockets", except they run over UDP (which has its own port numbers), and the strings are handled per application (instead of at the OS level).
Also, it's probably a lot more complicated than it would have been had it been implemented about 30 years ago. But... it does look like we did get there already!
... comments welcome, either in email or on the (eventual) Mastodon post on Fosstodon.