Where GRUB finds things
... "MBR" edition; there might be some overlap with the UEFI version though.
When you install a Linux distribution on your hard disk, one of the steps is to make it bootable, using GRUB. The installer normally takes care of this, but there is also the grub-install command, which lets you do it manually.
This is probably all you need to know, unless:
- your computer has multiple hard drives
- you keep adding and removing them
- you also don't want to put GRUB on the hard drive you're booting from, because your SAS controller + BIOS is stupid
- you're also doing the installation from a live USB stick because everything is broken already.
Still, the experience described above made me think a bit more about the fact that I had no idea how GRUB actually finds things.
I'm still not sure of the ways it can find things, but... at least I have some reasonable guesses. Namely:
For MBR / BIOS setups, it is the MBR that will be executed first. It is very tiny (a single 512-byte sector) and will not be able to find anything in actual file systems. What we'll need is GRUB's core.img, which is smart enough to do this (... but still doesn't include every module ever).
The default location for core.img is outside of any partition, in the gap between the MBR (the first sector of the disk) and where the first partition starts; this way GRUB's MBR code can point directly at it and load it through the BIOS. Various operating systems sometimes put additional pieces of data there, too; GRUB tries to be as resilient as possible against this.
If this doesn't work for whatever reason, core.img can also sit as a file on the boot partition, with the location of its sectors hardcoded into the MBR (... with the obvious caveat that this stops working if the file ends up being moved around on the boot partition).
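To see how much room that gap actually has on a given disk, you can read the MBR partition table directly. This is a minimal sketch, assuming a classic MBR layout: the partition table at byte offset 446, four 16-byte entries with the start sector stored as 4 little-endian bytes at offset 8 of each entry, and the 55 aa signature in the last two bytes of the sector. The function name mbr_first_partition_lba is made up for illustration and is not part of GRUB; it works on a disk image or on a device node you can read.

```shell
#!/bin/sh
# Print the start sector (LBA) of the earliest partition in a classic MBR.
# Sectors 1 .. (that LBA - 1) are the gap where core.img is embedded by default.
# Hypothetical helper for illustration, not part of GRUB.
mbr_first_partition_lba() {
    disk=$1

    # The last two bytes of sector 0 must be the boot signature 55 aa.
    sig=$(od -An -tx1 -j510 -N2 "$disk" | tr -d ' \n')
    if [ "$sig" != "55aa" ]; then
        echo "no MBR signature on $disk" >&2
        return 1
    fi

    first=0
    for entry in 0 1 2 3; do
        # Start LBA: 4 little-endian bytes at offset 8 of each 16-byte entry.
        offset=$((446 + entry * 16 + 8))
        set -- $(od -An -tu1 -j"$offset" -N4 "$disk")
        lba=$(( $1 + ($2 << 8) + ($3 << 16) + ($4 << 24) ))
        # Unused entries have a start LBA of 0; skip them.
        if [ "$lba" -gt 0 ] && { [ "$first" -eq 0 ] || [ "$lba" -lt "$first" ]; }; then
            first=$lba
        fi
    done
    # Prints 0 if the table has no partitions at all.
    echo "$first"
}
```

Something like `mbr_first_partition_lba /dev/sda` (as root) prints the sector where the first partition begins. Current partitioning tools usually start the first partition at sector 2048, which leaves about 1 MiB of gap; older layouts that start at sector 63 leave far less.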
Once we have core.img, we need to locate the actual boot partition, typically used for multiple things:
- to read further GRUB modules from
- to read GRUB's config from (including the boot menu)
- to store the kernels we'd want to launch.
If GRUB can't find this, what you'll get is "rescue mode" GRUB, with no boot menu and fairly limited functionality. However, this is fixable: you just need to set the "prefix" variable to the full GRUB-style path of the GRUB directory (something like (hd0,msdos1)/boot/grub), so that it can find these things.
This is also what the --boot-directory argument of grub-install is for: it points the parts of GRUB installed on the hard disk at a boot partition, which might even be on another disk. (... I'm not quite sure yet what identification mechanism is being used here, though. Is this resilient enough against reordering of disks? It... might be?)
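As a rough sketch, such an invocation from a live USB session could look like the following. The device names /dev/sdb1 and /dev/sda and the mount point /mnt/boot are placeholders of mine, not something to copy verbatim. The wrapper only prints the commands unless DRY_RUN=0 is set, since writing boot code to the wrong disk is hard to undo. Some distributions ship the tool as grub2-install instead.

```shell
#!/bin/sh
# Install GRUB's MBR code and core.img on one disk while pointing it at a
# boot partition that may live on another disk.
# All device names below are placeholders (assumptions); adjust them.
BOOT_PART=${BOOT_PART:-/dev/sdb1}     # partition that holds /boot
TARGET_DISK=${TARGET_DISK:-/dev/sda}  # disk whose MBR the BIOS will execute
MOUNT_POINT=${MOUNT_POINT:-/mnt/boot} # where the boot partition gets mounted
DRY_RUN=${DRY_RUN:-1}                 # 1 = only print, 0 = really run

# Print the command in dry-run mode, execute it otherwise.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

install_grub() {
    run mkdir -p "$MOUNT_POINT"
    run mount "$BOOT_PART" "$MOUNT_POINT"
    # --boot-directory=DIR makes grub-install put GRUB's files under DIR/grub
    # and set up core.img to look for them there.
    run grub-install --target=i386-pc --boot-directory="$MOUNT_POINT" "$TARGET_DISK"
}
```

Calling `install_grub` with the defaults just prints the three commands, which is a cheap way to double-check the device names before running it for real with `DRY_RUN=0`.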
Once it has the "prefix" partition and directory, GRUB can then load its config, produce a menu, and launch a kernel.
What grub-install actually does
- it puts core.img somewhere reasonable (see above)
- ... it also points it towards the boot partition
- it sets up the MBR so that it knows to find core.img
- it adds GRUB modules to the boot partition.
How can this break?
As mentioned before, GRUB rescue mode (with a prompt of "grub rescue>") happens if GRUB could load its core but can't find the boot partition. This article has a nice description of how to set the prefix to help GRUB find its config and get a boot menu (if everything else is working reasonably).
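For reference, setting the prefix by hand usually boils down to a handful of commands. The sketch below is a small shell helper (the name grub_rescue_commands is made up) that prints them for a given disk and partition; the output is meant to be typed at the "grub rescue>" prompt, not run by the shell. It assumes an MBR partition table (hence "msdos"), and that you have figured out the right disk and partition numbers yourself, for example by poking around with ls first. GRUB counts disks from 0 and partitions from 1.

```shell
#!/bin/sh
# Print the commands to type at the "grub rescue>" prompt.
# Hypothetical helper for illustration; the disk and partition numbers
# are whatever matches your machine.
grub_rescue_commands() {
    disk_index=$1              # 0 = first disk the BIOS reports
    partition_index=$2         # 1 = first partition on that disk
    grub_dir=${3:-/boot/grub}  # use /grub if /boot is its own partition

    device="(hd${disk_index},msdos${partition_index})"
    cat <<EOF
ls
ls ${device}${grub_dir}/
set prefix=${device}${grub_dir}
set root=${device}
insmod normal
normal
EOF
}

# Example: boot partition is the first partition of the first disk,
# with GRUB's files in /grub on that partition.
grub_rescue_commands 0 1 /grub
```

The first two ls lines are only there to confirm that you are looking at the right place. After the prefix is set, insmod normal loads the module that knows how to read the config, and normal should bring up the usual menu.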
Just getting a plain GRUB command line ("grub>") means that although GRUB's components could be loaded (probably from the boot partition), there isn't a usable menu in the config (... or no config is present at all). This might happen because grub-install not only sets up the MBR and core.img, it also nicely prepares all the modules (in "/boot/grub" by default). If, for example, you chroot into your Linux filesystem and run grub-install without also mounting "/boot" (assuming it's a different partition), you might run into this: your root file system will now have a perfectly functional set of GRUB modules, but no config (which is sitting on the real boot partition instead). Re-running grub-install after properly mounting "/boot" might help.
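One cheap way to catch this before it bites is to check, from the live system, that the tree you are about to chroot into really contains the config. A small sketch, assuming the root file system is mounted at a path like /mnt and that the config lives at boot/grub/grub.cfg below it (some distributions use boot/grub2 instead); check_boot_ready is a made-up helper name.

```shell
#!/bin/sh
# Check that the GRUB config is visible below a mounted root file system
# before chrooting into it and running grub-install.
# Hypothetical helper; the config path is an assumption (see above).
check_boot_ready() {
    root=${1:-/mnt}
    cfg="$root/boot/grub/grub.cfg"
    if [ -f "$cfg" ]; then
        echo "ok: found $cfg"
        return 0
    fi
    echo "missing: $cfg (is the boot partition mounted on $root/boot?)" >&2
    return 1
}
```

If `check_boot_ready /mnt` reports the file as missing, mounting the boot partition on /mnt/boot before the chroot should make the real config visible again.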