Optimizability

2022/02/16

... being simple to optimize is more important than speed itself.

This is post no. 81 for Kev Quirk's #100DaysToOffload challenge. The point is to write many things, not to write good ones. Please adjust quality expectations accordingly :)

"It's okay that it's slow; computers are fast anyway, and we can just optimize that tiny bit of code that does need to be fast."

Proponents of "scripting languages" (... whatever the definition might be), various VMs and Web things often come up with this answer to the issue of their languages not being especially fast. After all, "performance" is not an issue anymore most of the time. It'd be silly to write everything, including code that runs only once every day, in assembly code optimized real hard with SSE instructions. Or C, for that matter.

Of course, the problem is that there are some parts of code that do need to be optimized. Inner loops doing image processing, for example. Or look at Ninja, the build system: making it fast was an explicit design goal.

However, even there, it's just a tiny bit of code. This is what DJB's "The death of optimizing compilers" article was arguing: there is no point in making a compiler try to optimize everything as well as an optimizing compiler can, since the biggest difference will be made by actual humans working on the hottest spots anyway. The focus, instead, should be on making this (explicit) optimization easy, whenever it is needed.

... so we can use slow languages most of the time?

This is, actually, where most of the difference in attitudes is coming from. Since... well, technically, yes: this means that you could write most of your code in Python, profile the thing, discover which parts are slower, write extensions in C and / or assembly to speed up those parts; done, easy, and you walk away with most of the gains. Right?
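The "profile the thing" step, at least, is about as easy as it sounds. A minimal sketch with the standard library's cProfile; the functions here (slow_sum_of_squares, cheap_setup, main, profile) are made up for illustration, standing in for whatever your program actually does:

```python
import cProfile
import io
import pstats


def slow_sum_of_squares(n):
    # Deliberately naive hot spot: a pure-Python loop.
    total = 0
    for i in range(n):
        total += i * i
    return total


def cheap_setup():
    # Something that runs once and barely matters.
    return list(range(10))


def main():
    cheap_setup()
    return slow_sum_of_squares(200_000)


def profile(func):
    # Run func under the profiler; return its result and a text report.
    profiler = cProfile.Profile()
    result = profiler.runcall(func)
    stream = io.StringIO()
    stats = pstats.Stats(profiler, stream=stream)
    stats.sort_stats("cumulative").print_stats(5)
    return result, stream.getvalue()


if __name__ == "__main__":
    result, report = profile(main)
    print(report)
```

The report lists functions by cumulative time, so the loop should stand out as the part worth rewriting. Finding it is the cheap part; what comes after is where it gets interesting.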

... except... in real life, this is not how it ends up working.

In real life, once you conclude that your Python code is slow... you will maybe try speeding it up in Python. Because that's easy. And you're, first and foremost, a Python programmer; that's why you're using it.

And then... well, you could write an extension, but then you'd need to move all your data across a programming language boundary, which... is probably not really worth it, for a tiny theoretical speedup.
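To get a feel for that boundary, here is a rough sketch using ctypes from the standard library. There is no actual C code involved; it only converts a Python list into the kind of contiguous C array an extension would want, and back. The helper names (to_c_array, from_c_array) are my own, and the timings will differ from machine to machine, so treat it as something to run yourself rather than as a benchmark:

```python
import ctypes
import timeit


def to_c_array(values):
    # Copies every Python float into a contiguous array of C doubles.
    return (ctypes.c_double * len(values))(*values)


def from_c_array(c_values):
    # Copies everything back into ordinary Python float objects.
    return list(c_values)


data = [float(i) for i in range(100_000)]
c_data = to_c_array(data)
round_trip = from_c_array(c_data)

if __name__ == "__main__":
    # Compare the cost of crossing the boundary with the cost of some
    # simple work done entirely on the Python side.
    convert = timeit.timeit(lambda: to_c_array(data), number=20)
    work = timeit.timeit(lambda: sum(data), number=20)
    print(f"list -> C array: {convert:.4f}s for 20 runs")
    print(f"sum() in Python: {work:.4f}s for 20 runs")
```

If the conversion costs about as much as the work you wanted to speed up, the extension has to be a lot faster just to break even.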

Of course, if you gain 10x speed, you might end up doing it. (... unless your code is already fast enough on the things you're running it on. Which might or might not be the kind of data others are running it on.)

... alternatively...

... consider a language that might be even slower than Python. But! you can incrementally optimize it into something that's really fast.

Lisp might or might not be a good example for this; you can incrementally optimize it to C-like speeds, but... it might fail at the "being slower than Python" part.

Python, in fact, does have Cython, which can take code that looks a lot like Python, and, with the right annotations, turn it into something really fast. The problem is... Python's packaging system is impressively terrible, so "adding a Cython module" is still a big barrier. (... along with the fact that it's still not entirely the same language.)

Afterwards, make it easy to diagnose slow things. (... Cython, with its annotated HTML files, is fairly neat for this!)

What's important though is that, optimally, you shouldn't have to make a jump from "this is the slow programming language" to "this is the fast one". It should be different shades of "slow", with more and more implementation-specific ugliness added as we get faster and faster.
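A toy version of "shades", using plain Python only as a stand-in: the same dot product three times, each a bit less obvious than the previous one, none of them leaving the language. The function names are invented for this example, and how much each step gains (if anything) depends on the interpreter, so measure instead of trusting me:

```python
import operator
import timeit


def dot_plain(xs, ys):
    # Shade 1: the obvious version, indexing by hand.
    total = 0
    for i in range(len(xs)):
        total += xs[i] * ys[i]
    return total


def dot_zip(xs, ys):
    # Shade 2: still readable, no explicit indexing.
    return sum(x * y for x, y in zip(xs, ys))


def dot_map(xs, ys):
    # Shade 3: a bit uglier; pushes the loop into built-ins.
    return sum(map(operator.mul, xs, ys))


if __name__ == "__main__":
    xs = list(range(100_000))
    ys = list(range(100_000))
    for func in (dot_plain, dot_zip, dot_map):
        seconds = timeit.timeit(lambda: func(xs, ys), number=20)
        print(f"{func.__name__}: {seconds:.4f}s for 20 runs")
```

In a language built around this idea, the later shades would go much further (type annotations, unchecked arithmetic, implementation-specific tricks), but the shape would be the same: one function, getting uglier and faster in small steps.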

... comments welcome, either in email or on the (eventual) Mastodon post on Fosstodon.