Imagining State in Programs

2024/05/18

"Right now, I have to really concentrate and I might even get a headache just trying to imagine something clearly and distinctly with twenty or thirty components. When I was young, I could imagine a castle with twenty rooms with each room having ten different objects in it. I would have no problem. I can't do that anymore. Now I think more in terms of earlier experiences. I see a network of inchoate clouds, instead of the picture-postcard clearness. But I do write better programs."

- Charles Simonyi

(I'm reasonably convinced that at some point he actually said this, even though I couldn't find many sources for it.)

Math is hard. You have layers of concepts building on each other; you have to understand all the layers below reasonably well to even comprehend the questions above. (Take, for example, the convergence of a sequence. It's about there being a limit L so that for every distance epsilon, you can find a position N in the sequence after which all the remaining elements are closer to L than epsilon. It's nontrivial. Unless you are familiar with it, in which case it's easy and you can go off building something out of it that is yet again nontrivial.)
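
In symbols, the definition in the parenthesis above reads as follows, with a_n as an assumed name for the n-th element (the text itself doesn't name it):

```latex
\[
\lim_{n \to \infty} a_n = L
\quad\Longleftrightarrow\quad
\forall \varepsilon > 0 \;\; \exists N \in \mathbb{N} \;\; \forall n > N : \;\; |a_n - L| < \varepsilon
\]
```

Three nested quantifiers in one line: compact once you know how to read it, opaque until then.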

There are parts of programming that are hard in similar ways. Understanding how red-black trees work is also a bit like math.

Most programmers, most of the time, are not dealing with questions like this though.

In practice

The actual set of concepts being used in most day-to-day programming is not especially deep. The programming language itself has its own share of numbers, strings, lists, and various other data structures; being able to use these is not too hard. Also, if you take any 200-line subset of a big project, you are unlikely to find arcane pieces of magic concentrated in it.

And yet, programming is still hard. Most people are bad at it. The ones who are good get things done a lot more efficiently than others.

Part of the reason is that, although these things are not hard, there are just a lot of them.

As it turns out, the real world is a lot more complicated in practice than in theory. What starts out as an elegant mathematical model gains more and more moving parts as time goes by. Part of what makes good code is separating all these moving parts into various compartments so that you can get things done without thinking about all of them at once.

But you still need to know what's going on in at least some of the problem space. After all, you're just about to get your soul printed into a piece of code; things that the program will be calculating are a reflection of what is going on in your head while thinking about it, so you've got to get the thinking right first.

Typically you do this by reading the existing source code and figuring out what the code is doing. Then, if you really don't understand what's going on, you put in some logging and stare at the results. Or you pull out a debugger as a last resort.
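
As a small illustration of the "put in some logging and stare at the results" step, here is a minimal Python sketch. The function apply_discounts, the logger name and the numbers are all made up for this example; the only point is that the intermediate state gets printed instead of imagined.

```python
import logging

logger = logging.getLogger("orders")  # hypothetical logger name


def apply_discounts(prices, discount_rates):
    """Apply each discount rate in turn to every price (illustrative only)."""
    current = list(prices)
    for step, rate in enumerate(discount_rates, start=1):
        current = [round(p * (1 - rate), 2) for p in current]
        # Log the intermediate state instead of trying to imagine it.
        logger.debug("after step %d (rate=%.2f): %s", step, rate, current)
    return current


if __name__ == "__main__":
    logging.basicConfig(level=logging.DEBUG, format="%(levelname)s %(message)s")
    apply_discounts([100.0, 40.0], [0.10, 0.50])
```

Running it prints one line per step, which is about as much visibility as most of us ask for before reaching for the debugger.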

Meanwhile, in the rest of the world

Excel exists.

You most definitely don't have to be a programmer to understand what's going on in an Excel sheet. There is a two-dimensional grid of cells. Each cell holds either an actual number or a formula that takes values from other cells. You can typically look at them and figure out whether they make sense or not. It's really not rocket science.

You can also just modify the numbers or the formulas and immediately see what happens if you do.
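
The whole model fits in a few lines of Python. This is only a toy sketch, not how Excel works internally: the Sheet class and the cell contents are invented for illustration, formulas are plain functions recomputed on every read, and there is no dependency tracking or cycle detection.

```python
class Sheet:
    """A toy spreadsheet: each cell holds a number or a formula."""

    def __init__(self):
        self.cells = {}

    def set(self, name, value):
        # value is either a plain number or a function that takes the sheet
        self.cells[name] = value

    def get(self, name):
        value = self.cells[name]
        # Formulas are recomputed on every read, so edits show up immediately.
        return value(self) if callable(value) else value


sheet = Sheet()
sheet.set("A1", 120)                                   # unit price
sheet.set("A2", 3)                                     # quantity
sheet.set("A3", lambda s: s.get("A1") * s.get("A2"))   # total = A1 * A2

print(sheet.get("A3"))  # 360
sheet.set("A2", 5)      # change an input...
print(sheet.get("A3"))  # ...and the total follows: 600
```

Every value is visible and every change is reflected right away, which is exactly the property most programming environments don't give you.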

Spreadsheets, in fact, were a major achievement in computer science. They were a big selling point for computers in the 80s: there was a decent amount of competition between multiple companies (e.g. Lotus and Microsoft) trying to ship the most functional and fastest spreadsheet package. It makes sense: the finance world, apparently, still runs on Excel.

And yet, spreadsheets couldn't really calculate anything that you couldn't have done with a big chunk of COBOL source code on the company mainframe. They just made those calculations easier and faster to do.

But why is it easier and faster? Aren't real programming languages more powerful?

Visibility, yet again

You don't need to see things on the screen once they are in your head.

If they already are, you can get much farther implementing actual solutions instead of pretty visualizations and nice user interfaces showing you what's going on. (You already know anyway.)

It's a tradeoff, all else being equal. It's much harder to build a system that is ready to run in half a second than one that takes five minutes to compile (if they both have to have the same functionality).

So you just take the people who are good at keeping things in their head to build things for you. They are good at it. They wouldn't be doing much better if things were more visible. (Up to a given level of complexity at least.)

The conclusion is then drawn: programming requires being good at imagining program state without actually observing it or poking at it. After all, people who are programmers are good at it.

But what if we had much better observability tools? Debuggers? Visualizations?

An example is the Lisp way of writing code while your program is running. You don't have to guess what your inputs will be if they are right in front of you.

Are we still missing advances whose absence makes some programming pointlessly high-effort? And are we just selecting for people who are good at tolerating this?