Kerberos with SSH, Part 2: debugging
... Part 1 here; we left off where it _should_ have worked but it didn't.
So your fancy SSH public key / Kerberos / etc. setup should be working. You did all the things to make it work, and yet...
you@client ~> ssh server.domain.example.com
you@server.domain.example.com's password:
... in other words, it didn't work.
You can possibly try
you@client ~> ssh -vv server.domain.example.com
("-vv" for very verbose)... and you might be able to figure out what went wrong; nevertheless, many times the interesting reason for rejection resides on the server side. It'd be nice to get some nice debug printouts from the ssh server... so... we edit the config file, restart it, try logging in, and... if it's a remote server, hope that we didn't accidentally mess things up...
... or...
server ~# sshd -p 12345 -D -d -e
debug1: sshd version OpenSSH_7.9, OpenSSL 1.1.1d 10 Sep 2019
debug1: private host key #0: ssh-rsa SHA256:(...)
debug1: private host key #1: ecdsa-sha2-nistp256 SHA256:(...)
debug1: private host key #2: ssh-ed25519 SHA256:(...)
debug1: rexec_argv[0]='/usr/sbin/sshd'
debug1: rexec_argv[1]='-p'
debug1: rexec_argv[2]='12345'
debug1: rexec_argv[3]='-D'
debug1: rexec_argv[4]='-d'
debug1: rexec_argv[5]='-e'
debug1: Set /proc/self/oom_score_adj from 0 to -1000
debug1: Bind to port 12345 on 0.0.0.0.
Server listening on 0.0.0.0 port 12345.
debug1: Bind to port 12345 on ::.
Server listening on :: port 12345.
This is a regular ssh server, running on port 12345 (which we just made up), in addition to the ssh server we're already running. We also have:
- -D so that it doesn't detach from the terminal (so that we see all the error messages)
- -d for there to be a lot of debug messages (in this mode sshd also handles a single connection and then exits)
- -e so that these debug messages end up being written to stderr (i.e. our terminal) instead of the system log.
As a result, we get a lot of debug info, but we don't clutter up the logs for anyone else, and we don't mess with the already-working ssh server (... even if we edit its config file, it's already running, so we have a better chance for recovery).
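A side note on the "edit the config and hope" route: sshd has a test mode, so a syntax check before restarting the real server is cheap. A minimal sketch, assuming it is run as root on the server; gss_settings is a made-up helper name, and the keyword pattern is only my guess at which settings matter for this setup:

```shell
# Keep only the GSSAPI/Kerberos lines from `sshd -T` output (read from stdin).
# `sshd -T` prints the effective configuration with lowercase keywords,
# which is why the pattern is lowercase.
gss_settings() {
    grep -E '^(gssapi|kerberos)'
}

# Usage, as root on the server:
#   sshd -t && echo "config parses"   # -t: test mode, checks config file and keys
#   sshd -T | gss_settings            # -T: extended test mode, dumps effective config
```

If `sshd -t` complains, the already-running server is still untouched, so there is time to fix the file before restarting anything.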
From here (after pointing the client at the new port with "ssh -p 12345 server.domain.example.com"), debugging is often a lot easier:
debug1: userauth-request for user simon service ssh-connection method gssapi-with-mic [preauth]
debug1: attempt 1 failures 0 [preauth]
Ohh, look what just happened. What the hell is "preauth" and how do we fix it?
Well, without going into details, as it turns out...
(... and this is where I just did an hour of debugging...)
... it's the thing from Part 1, the "Kerberos realms should be upper case" one!
You might ask "how is this obvious from the debug logs"?
... well, it really isn't, short of strace-ing sshd and staring at the config files it reads. Yes, debug messages are still terrible.
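For the upper-case realm problem specifically, a crude check can save the strace session. This is a sketch only: it assumes the realm in question is the default_realm line of /etc/krb5.conf (Part 1 may have had it somewhere else), and check_default_realm is a hypothetical helper, not part of any Kerberos tooling:

```shell
# Report default_realm values that contain lowercase letters.
# Argument: path to a krb5.conf-style file (defaults to /etc/krb5.conf).
# Returns 0 if nothing suspicious is found, 1 otherwise.
check_default_realm() {
    conf="${1:-/etc/krb5.conf}"
    # Strip "default_realm =" and keep only values with a lowercase letter in them.
    bad=$(sed -n 's/^[[:space:]]*default_realm[[:space:]]*=[[:space:]]*//p' "$conf" \
        | grep '[[:lower:]]' || true)
    if [ -n "$bad" ]; then
        printf 'lowercase realm: %s\n' "$bad"
        return 1
    fi
    return 0
}

# Usage:
#   check_default_realm                 # checks /etc/krb5.conf
#   check_default_realm ./krb5.conf     # checks a copy
```

It only looks at one setting, so a clean result does not prove the rest of the file is fine.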
However... there is yet another thing that can go wrong!
debug1: attempt 1 failures 0 [preauth]
debug1: Unspecified GSS failure. Minor code may provide more information
No key table entry found matching host/localhost@
debug1: userauth-request for user simon service ssh-connection method gssapi-with-mic [preauth]
I would've never guessed that "unspecified GSS failure" would be a friendlier error message in comparison, but... apparently... it is: note the "localhost" part. Apparently, the SSH server is trying to figure out who exactly it is; since our /etc/hosts currently reads
127.0.0.1 localhost server
::1 localhost localhost ip6-localhost ip6-loopback
its attempt to resolve itself (to figure out the domain / realm it belongs to) leads back to "localhost"; we're trying to log in with "DOMAIN.EXAMPLE.COM", so that doesn't quite match up.
Thus, we can fix our /etc/hosts this way:
127.0.0.1 server.domain.example.com localhost server
::1 server.domain.example.com localhost localhost ip6-localhost ip6-loopback
which we can test by trying to ping localhost (... it should print the full domain name).
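To check the same thing without ping, one can look at which name comes first for the loopback address, since the first name on a hosts line is the canonical one. A small sketch; first_host_name is a made-up helper, and it only reads the hosts file, so it ignores DNS and whatever lookup order the system is configured with:

```shell
# Print the canonical (first) name listed for an address in a hosts-style file.
# Arguments: the address, then optionally the file (defaults to /etc/hosts).
first_host_name() {
    addr="$1"
    file="${2:-/etc/hosts}"
    # The first field is the address, the second the canonical name;
    # commented-out lines never match because their first field starts with '#'.
    awk -v a="$addr" '$1 == a { print $2; exit }' "$file"
}

# Usage:
#   first_host_name 127.0.0.1     # after the fix: server.domain.example.com
#   first_host_name ::1
```

On a glibc system, "getent hosts 127.0.0.1" asks the actual resolver the same question, which is closer to what sshd ends up seeing.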
(... we'll reserve "preauth" for Part 3, if it ever ends up happening.)
This is post no. 16 for Kev Quirk's #100DaysToOffload challenge.
... comments welcome, either in email or on the (eventual) Mastodon post on Fosstodon.