
Programming Languages as Interfaces

2021/12/26

... are "languages" in multiple senses of the word.

This is post no. 58 for Kev Quirk's #100DaysToOffload challenge. The point is to write many things, not to write good ones. Please adjust quality expectations accordingly :)

Much has been said about how the choice of programming language for your project matters a lot. Or about how it doesn't really matter that much.

Or that it's OK to sacrifice speed if your language is more expressive: hardware is cheap. Or... maybe it is not okay to sacrifice speed because you get slow, garbage software that is a pain to use.

The overall assumption though is almost always that the "language" part in "programming language" is about something that humans use to talk to computers. Which is, of course, part of the story.

Part, but not all.

The way programs talk to each other

Programs are, after all, little soul fragments of programmers who wrote them. A C++ library is a mirror image of someone's mind while they were thinking in C++, with pointers, types, templates flying around in their heads. To talk to this library, you also need to think in C++, otherwise the library won't understand what the hell you want, in terms of concepts.

Ohh so you happen to be writing Java code? Well, time to teach it a little C++ then.

Bindings and separation

The typical answer to this is just... "write some neat wrapper code". You can totally use OpenCV from Python, for example! Or, actually, numpy itself is doing matrix multiplication in native code; this is not something most people need to care about.

The key insight here though is that being able to call into another language is not enough of an abstraction layer to avoid speaking that language entirely. numpy is not a C / C++ project with Python bindings; it's a Python project with some C / C++ parts, and its authors do know how to deal with C / C++. If all you do is expose some APIs from another language, whoever is using them will still need to think in that other language (e.g. keep track of object lifetimes, instantiate templates in bindings), plus be aware of the peculiarities of the binding itself (whether a call copies its inputs or just takes them by reference, etc.). This is a lot of extra complexity.
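To make the "copy or reference?" question a bit more concrete, here is a minimal sketch using Python's standard ctypes module. No actual C library is loaded; the names raw, IntArray3, shared, copied and read_int32 are made up for this illustration. The only point is that two very similar-looking calls have different memory semantics, which is exactly the kind of thing you have to keep in your head when talking to native code.

```python
import ctypes
import sys

# Three 32-bit integers' worth of zeroed memory, owned by Python.
raw = bytearray(ctypes.sizeof(ctypes.c_int32) * 3)

# A C-style array type, roughly int32_t[3].
IntArray3 = ctypes.c_int32 * 3

# from_buffer() shares memory with `raw`: no copy is made.
shared = IntArray3.from_buffer(raw)

# from_buffer_copy() takes a snapshot: the array gets its own memory.
copied = IntArray3.from_buffer_copy(raw)

shared[0] = 42  # this write is visible through `raw`
copied[1] = 7   # this write is not


def read_int32(buf, index):
    """Decode the index-th 32-bit integer from a bytes-like buffer."""
    start = index * 4
    return int.from_bytes(buf[start:start + 4], sys.byteorder, signed=True)


raw_first = read_int32(raw, 0)   # written through the shared view
raw_second = read_int32(raw, 1)  # untouched by the copy
print(raw_first, raw_second, copied[0])
```

This should print 42 0 0: the write through shared lands in the original buffer, the write through copied does not, and copied never sees the 42 because its snapshot was taken earlier. Nothing in the Python syntax tells you which is which; you need to know the C-side story.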

You can hide this by wrapping all of it in more Python. You can hide this really well if you just wrap it into UNIX, only exposing a binary that you can call with a list of string parameters. You will, however, either lose a lot of power / speed (as a command line utility) or need to do a lot of extra work (as a Python library).
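As a rough sketch of what "wrapping it into UNIX" looks like from the caller's side: everything goes in as strings, everything comes back as text. The helper name add_via_cli is hypothetical, and a second Python interpreter stands in for "some binary written in another language" so that the example runs anywhere.

```python
import subprocess
import sys

# Stand-in for a native tool: reads two numbers from argv, prints the sum.
CHILD_PROGRAM = "import sys; print(float(sys.argv[1]) + float(sys.argv[2]))"


def add_via_cli(a, b):
    """Add two numbers by calling a separate process (hypothetical helper)."""
    result = subprocess.run(
        # Arguments must be serialized to strings before crossing the boundary.
        [sys.executable, "-c", CHILD_PROGRAM, str(a), str(b)],
        capture_output=True,
        text=True,
        check=True,  # raise if the child exits with a non-zero status
    )
    # The result comes back as text and has to be parsed again.
    return float(result.stdout)


total = add_via_cli(1.5, 2.25)
print(total)
```

The caller needs to know nothing about the other language, which is the nice part. The cost is a whole process launch plus a round trip through strings for a single addition, and there is no way to pass anything richer than text without inventing a serialization format.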

Workarounds?

Trying to harmonize concepts in programming languages goes a long way. Take Microsoft's CLR, for example. You can write half of your project in C#, the other half in Visual Basic, without most of the difficulties listed above. After all, CLR languages share

... taking some things from the page linked above. All of this is also not entirely unrelated to Microsoft's COM, which targets language interoperability too.

It's also not a coincidence that C/C++ code talking to COM is tricky and occasionally ugly (... we're not talking actual native C/C++ here; it's COM!). Or that Visual Basic 6.0 (which was an actual language) is a lot simpler than VB.NET (which is more about talking CLR with a Basic syntax).

There is no free lunch. You need to speak the language of the code you're talking to, not just the language you're writing your project in. Given the number of libraries you need to use, the former might matter more than the language of your own code.

Solutions?

Just stick with one language if you can.

It helps if your language has decent performance. Otherwise, you might be tempted to rewrite bigger and bigger parts in C; you'll probably end up with a horrible mess, with more glue code than there should be code in total.

There are people (Casey Muratori for example? citation needed though) who think that instead of using "scripting languages", we should just go for straight C; it's not that bad. Here (YT link) is Jon Blow, from the same circles, explaining why scripting languages are a bad idea. (Perhaps not coincidentally, he's writing his own language.)

He's only talking about scripting languages; I'd extend this though to any language other than the one your libraries are written in. It's the same story if you're trying to use a JavaScript API from C++ compiled to WebAssembly.

A neat counterpoint is, of course, Lisp Machines; it's Lisp all the way down. No need for glue code for C code; the OS is in Lisp, your code is in Lisp, it just works, in a way that's much more powerful than (especially contemporary) UNIX.

Of course, the very same principle makes Lisp a much weaker choice if your OS is UNIX and all your libraries happen to be in C / C++. There is a reason why most UNIX utilities are still written in C; it's just simpler.

Also, don't forget TempleOS, as an example of how you can write a great OS (yes it's awesome, read the linked article) only in a C dialect, and how this gives you more power than the mix of UNIX-the-language and C that UNIX systems are written in.

... comments welcome, either in email or on the (eventual) Mastodon post on Fosstodon.