Apologies for absence

Some thoughts on coding badly

May 05, 2023

Apologies for absence. I have been in a dark place, reached through an icon that is labelled Visual Code.

I love hearing when you’re programming, said my unwife from the room next door, as another volley of “fucking shitbags” rang out after an error message appeared from the depths of the screen. I was sure that I hadn’t myself typed a (fucking) per cent sign, so why was the program complaining that it couldn’t understand one? The answer turned out to be ChatGPT, and in so far as this latest bout of the fever brought anything new it was an understanding of what this kind of supercharged autocorrect is useful for and how it is dangerous.

Understand, I am not a real programmer. I make small things that are useful to me as an occasional hobby when I start using a piece of software that almost does what I need. I’ve never yet found a word processor that will transpose the two characters around the cursor without being tweaked, so I have written routines to do that in at least four programs I’ve used, since this is the most common mistake my fingers make. Around the beginning of the century I wrote a bunch of scripts to convert old articles into HTML for my blog and web site. Later I wrote something to make a Raspberry Pi function as a granny cam to keep an eye on my dementing mother at night and alert me if she fell out of bed.

In the runup to the Brexit referendum, I wrote a script that counted the women’s body parts, the “migrants” and the Kardashians on the Mail Online every morning and turned the result into a spreadsheet. Alas, the Guardian thought it was too tasteless, but it was actually a surprisingly accurate guide to the temperature of the debate — every time the Leave campaign looked to be in trouble, the “migrant” count would zoom up and at two moments of real crisis it overtook even the Kardashians.

There is a distinct pleasure to this kind of scripting, more like Sudoku than chess in that each successful move seems to entail the next move but you don’t know how it’s going to work out right. With chess problems you have to have an idea of the solution before you can find the steps towards it.

Nothing in those experiences had prepared me for last fortnight’s problem. I keep personal emails from ages back but I never look at them systematically: they are there to be searched when I need something. And there are no more than a thousand or two every year. It was incomprehensible that there should suddenly be 4,700,000 of them in one archive folder. But there were. The file was immense — 193GB. This was much too big for the email server to handle, so I couldn’t use any normal program to investigate it. I had to write a python script to download only one random day’s email. In due course, after perhaps an hour of swearing, thousands of messages came down from the 20th July 2020 — none of which I remember reading. The overwhelming majority were duplicates of spam: there were for instance 200 copies of an advertisement for “Nutritious slimming soup”, all with the same sender, subject, date and time.
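
For anyone wanting to try the same trick, here is a minimal sketch of how one day's mail might be picked out, assuming the server speaks IMAP (the account above is not described in that detail). The function name `imap_day_criteria` is made up for illustration; it only builds the search string, so it can be checked without touching a server.

```python
from datetime import date, timedelta

# IMAP wants English month abbreviations whatever the local language is,
# so they are spelled out here rather than taken from strftime.
MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]


def imap_day_criteria(day: date) -> str:
    """Build an IMAP SEARCH string matching messages dated on `day` only."""
    def fmt(d: date) -> str:
        return f"{d.day:02d}-{MONTHS[d.month - 1]}-{d.year}"

    # SINCE includes the given date; BEFORE excludes it.
    return f"(SINCE {fmt(day)} BEFORE {fmt(day + timedelta(days=1))})"


# Hypothetical usage with the standard library's imaplib
# (host, login and folder name are placeholders):
#   conn = imaplib.IMAP4_SSL("imap.example.com")
#   conn.login(user, password)
#   conn.select("Archive")
#   status, data = conn.search(None, imap_day_criteria(date(2020, 7, 20)))
```

Asking the server to do the filtering means the enormous folder never has to be downloaded in one piece.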

This showed me how I could identify duplicate messages. The method looked simple enough: grab the headers of a bunch of messages; look at each one in turn. Mark it as unique if it has not been seen before, and add it to the set of unique messages. Otherwise, if it is already in that set, delete it. When all of the bunch has been sorted, start the next one. Repeat until rapture.
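
The method can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the script I ended up with: it assumes each message arrives as a pair of (id, raw header bytes), and it treats two messages as duplicates when sender, subject and date all match. The names `header_key` and `find_duplicates` are invented for the example.

```python
from email import message_from_bytes


def header_key(raw_headers: bytes) -> tuple:
    """Reduce one message's raw headers to a (sender, subject, date) key."""
    msg = message_from_bytes(raw_headers)
    # str() guards against header objects that cannot go into a set.
    return (str(msg.get("From", "")),
            str(msg.get("Subject", "")),
            str(msg.get("Date", "")))


def find_duplicates(batch, seen):
    """Return the ids in `batch` whose key has already been seen.

    `batch` is an iterable of (message_id, raw_header_bytes) pairs.
    `seen` is a set shared across batches; it is updated in place.
    """
    duplicates = []
    for message_id, raw_headers in batch:
        key = header_key(raw_headers)
        if key in seen:
            duplicates.append(message_id)  # seen before: candidate for deletion
        else:
            seen.add(key)  # first sighting: keep it and remember it
    return duplicates
```

Because `seen` outlives each call, a copy of the slimming soup in the second bunch is still caught by the original in the first.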

The next ten days were spent getting this simple plan to work reliably. Grab too large a bunch of messages and the program broke. Grab too small a bunch and it was unendurably slow. Although everything the server sent me appeared as characters on my screen, most of it was stored in the computer’s memory as numbers, or bytes, and had to be changed into characters before it could be analysed and worked on. Then the result must be changed back before the server at the other end could understand it. Endless days spent stumbling over variants of this problem.
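
Both halves of that problem fit in a short sketch. The helper names are hypothetical, the UTF-8 encoding is an assumption (real mail arrives in many encodings), and the comma-separated id list is only one plausible shape for what goes back to the server.

```python
def batches(items, size):
    """Yield successive lists of at most `size` items from a list."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def to_text(raw: bytes) -> str:
    """Turn bytes from the server into characters that can be analysed."""
    # errors="replace" swaps undecodable bytes for U+FFFD instead of crashing;
    # spam in particular is often badly encoded.
    return raw.decode("utf-8", errors="replace")


def to_wire(ids) -> bytes:
    """Turn a list of message ids back into bytes for the server."""
    return ",".join(str(i) for i in ids).encode("ascii")
```

The bunch size is left as a parameter because the right value could only be found by trial: big enough to be tolerable, small enough not to break.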

I learned two things of wider application then. First, that I can’t think in code. Only when I wrote out all my difficulties in English prose could I understand them well enough to solve them. The other is that the limited form of AI that Microsoft has built into its code editor is unnervingly useful. This is because bullshit is no use in code. Either a line works, or it does not. The computer will only do what you tell it; not what you would like it to think that you mean.

The Microsoft “copilot” is an autocorrect informed by the millions of programs up on GitHub. Unlike the autocorrect we are used to for human language, it is working over a very limited domain and it is really likely that the most common continuations in this huge corpus of code will actually be the right ones. If they weren’t, no one would publish them. So you are treated to the ghostly appearance of five or six lines of boilerplate code and need only press Tab to have them solidified. Very likely, but not certainly, they will work. I spent some time banging my head against an error that an automatic completion had introduced into the code. The compiler kept complaining about an error I knew I had not typed, but in the trance of tunnel vision where I then lived, I could not step out of that to realise that someone or some thing must have typed it instead.

I’m sure that some of this stupidity is due to age, and much of it to inexperience of the kind of problem I was dealing with. But most of the exhaustion and practical stupidity comes, I now think, from the difficulty of switching modes of cognition. Writing code is for me an immersion, a kind of involuntary concentration into a particular mode of thought quite distinct from the way I think in human language. And changing from the one mode to the other involves a wrench or a derailment. Perhaps an analogy with bricks is useful: the building blocks of code ideally resemble Lego blocks: all of them fit into one another in the same clearly defined ways and all can tidily be swapped for other ones. But working with language is much more like building a dry stone wall. Which stones fit where can only be determined one stone at a time and the ideal wall, or the ideal sentence, would have every stone, each word, in the place where it alone can fit.

Writing prose is for me just as immersive, but it is an iterative process of sounding out sentences until they ring true. You only learn which stone fits with which others where by trying them out. You only see which ones don’t properly fit when you come back, look properly, and squirm.

The slow deep hover
