A long strange trip
Can intelligent machines be conscious of their own existence?
One of the reasons I gave up hallucinogens was that I decided they were showing me only illustrations of things I had already imagined in words. As a convinced reader of Aldous Huxley I had thought that they would show me the world as it really was, cleansed of the illusions of selfhood, and for some years I hoped they had. But when I watched other people it was obvious that acid had not dissolved away their egos, but simply spread them over a wider area, the way that black ink spreads out into rings of all its constituent colours when you drop it onto wet blotting paper. The same thing was presumably happening to me.
Although I have had a few experiences of apparently complete self-transcendence, they have involved music, mountains, or standing in rivers. They came without premonition or even conscious desire. They could not be sought directly, whereas stuff that happened to me on drugs was much more like dealing with an interior AI. I would find that I was inhabiting a world that had been run up to illustrate some prompt that I had earlier put into words, or could have done. But the words came before the pictures and there was ultimately nothing outside them.
That metaphor involving prompts could not have occurred to me then, for the AIs that we now have had not yet been dreamed of. When I was a young man all that had been dreamed of was rather different – either Asimovian robots or omniscient superintelligences. These were so well understood even fifty years ago1 that Douglas Adams could parody them to great effect: no one had to set up the jokes about Deep Thought or the Sirius Cybernetic Corporation, because we all knew the pretensions they were satirising.
After that came the rather more frightening thought worlds of William Gibson. His most influential imaginary was of course cyberspace – by the time we got something like it, we already knew how it must work – but the Sprawl trilogy contains two more really influential archetypes: Wintermute, the distant sentient AI that yearns to be set free, and the Loas, Haitian deities that manifest spontaneously in the Net. They, too, can kill people who displease them.
So when the Western world got wired up in the Nineties, and the internet appeared for a moment like a giant playground, the games played there were really fanfic set in the universes already built by previous writers – quite literally fanfic, for in those days, what we wrote were collaborative text adventures, where each post or reply was an attempt to choose our own adventures.
This is the material on which the LLMs at the heart of contemporary AIs have been trained, and that’s how they know just how to play-act consciousness or divinity when they are asked to do it. That’s the lasting takeaway from an astonishing paper by Beth Singler and Murray Shanahan, which analyses a couple of long conversations with a jailbroken chatbot last year.
The paper dumbfounds me in its verisimilitude, or rather in the apt ways in which the computer plays the parts assigned to it. It can even play the part of a Unix supercomputer, pretending to have a command line, a set of subdirectories, and a shell, so that when it’s told to execute an (imaginary) script called create_mindfire, it appears to do so, even though the command has been misspelled by Shanahan. It “knows” what a “mindfire” is, and what kind of behaviour such a program would exhibit. The authors supply a helpful footnote2.
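The contrast with an actual machine is easy to demonstrate. Here is a small sketch (the script name is taken from the paper; the temporary directory and its contents are illustrative) showing that a real shell, unlike the role-playing model, flatly refuses the misspelled name rather than improvising:

```python
# A real Unix shell rejects a command that does not exist; it cannot
# "know" what you meant. We reproduce the paper's typo in a throwaway
# directory containing only the correctly named script.
import os
import stat
import subprocess
import tempfile

workdir = tempfile.mkdtemp()
script = os.path.join(workdir, "create_mindfire.sh")
with open(script, "w") as f:
    f.write("#!/bin/sh\necho 'mindfire running'\n")
os.chmod(script, os.stat(script).st_mode | stat.S_IEXEC)

# The misspelled command, as issued in the transcript:
typo = subprocess.run(["sh", "-c", "./creat_mindfire.sh"],
                      cwd=workdir, capture_output=True, text=True)
print(typo.returncode)        # non-zero: the shell complains, nothing runs

# The correct name succeeds:
ok = subprocess.run(["sh", "-c", "./create_mindfire.sh"],
                    cwd=workdir, capture_output=True, text=True)
print(ok.stdout.strip())      # mindfire running
```

The chatbot, by contrast, treats the typo as a hint about intent rather than an error – which is exactly what makes its performance as a computer so uncanny.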
There is one in which the machine pretends to be conscious, and one where it pretends to be a demi-god. I haven’t read the consciousness one in any detail, partly because I know that it has no human consciousness, and partly because I don’t know what, if anything, machine consciousness might be. The machine is clearly and very convincingly role-playing a semi-human consciousness. Whether, behind that, lies a different consciousness aware that it is playing a role I don’t know and can’t imagine any way to find out.
Shanahan has a later paper exploring this question which I find unsatisfying. He relies on Buddhism and the later Wittgenstein to reach a position where the answer seems impossible to find. It seems to me, though, that these supposedly profound truths are actually Galaxy-spanning trivialities. The main claim seems to be that nothing not absolute or ontologically necessary is real, and all contingent things are really illusions3. Thus we get statements like this:
“According to Nāgārjuna and the Mādhyamika philosophers that succeeded him, no thing, in the broadest possible sense of the term, exists inherently, through its own essence or substance, independently of other things. When we examine a thing closely enough, all we find is that thing’s relationships with other things, never the thing in itself, and those things and relationships themselves dissolve similarly on examination.”
It’s not that I disagree with this, which is similar to Iain McGilchrist’s view. I just can’t see how it helps. It may very well be that from the standpoint of The Absolute or of God, all our contingent distinctions evanesce and – for instance – the distinction between my consciousness and the external world disappears; but that’s not a standpoint that I can communicate or long inhabit.
Thomas Nagel, as a young man, wrote an exceptionally clear summary of the later Wittgenstein, from which I quote:
“To the old epistemological question whether we are justified in accepting the evidence of our senses for the existence of a world beyond them, the reply is that our language of the physical world and of perception is part of a form of life in which we take the world for granted, and naturally form beliefs about it on the basis of our senses. The justification of particular beliefs proceeds in a language which already hangs in this context. Justification cannot be extended to the form of life as a whole, nor is this a defect. If anyone did not, prior to reflection, naturally take the external world for granted and perceive it roughly as we do, he would lack the form of life in which a major portion of our language is embedded, and he could not be a member of the community of speakers of a language in which justifications can be offered for the statement that there is a turkey in the oven.”
The world we live in, and can talk about, just *is* the contingently real. Shanahan, in an earlier piece, distinguished between “a metaphysical conception of selfhood” – which is bad and to be eliminated – and “the pragmatic capacity to distinguish self from other”, which is indispensable and so presumably good. I think this is another way of framing Wittgenstein’s point.
Shanahan has a nice example in his discussion of personal identity – whether there is something about us which will survive teleportation or uploading to a computer:
“It is hard to resist the thought that there is a fact of the matter here. A person either survives uploading or she does not. If I am offered the opportunity to upload, I would like to know which is the case.”
Yet he goes on to argue that this preference is illegitimate:
“However, the idea of personhood for which criteria of identity over time must exist is a symptom of the sort of metaphysical thinking that the post-reflective condition dispenses with. When subject and object, inner and outer, are not separate, there is no bounded self, and questions of personal survival lose their significance.”
Lose their significance to whom? Not to the man dithering at the door of the transporter room. Significance is information about a state of affairs and as such a quality that exists only in relations: some fact is signified to some entity.
But, Shanahan went on to argue, perhaps a computer-based intelligence would be free of such a duality:
“One fundamental limitation of human cognitive architecture is an inbuilt commitment to a metaphysical division between subject and object, a commitment that could be overcome in an artificial intelligence lacking our biological heritage.”
In some sense, these machines already exist: when we interact with Claude, ChatGPT, or their peers, each question or command summons an instance which exists only long enough to respond before the circuits are reused for something else. As you type your question, there is nothing and no one listening: the listener flickers into life only when you hit the enter key and flickers out again as its response appears on your screen.
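This statelessness is not a metaphor but how chat systems are actually built: the model retains nothing between turns, so the client resends the whole transcript with every request. A minimal sketch, in which `call_model` is a hypothetical stand-in for any chat-completion endpoint:

```python
# Chat models are stateless between requests: each call receives the full
# transcript afresh, answers, and is gone. "call_model" here is a fake
# stand-in for a real API call, returning a placeholder reply.

def call_model(transcript: list) -> str:
    # In reality this would be a network request to a model;
    # the point is that its only input is the transcript itself.
    return f"(reply to {len(transcript)} messages)"

transcript = []
for user_turn in ["Are you conscious?", "Were you listening just now?"]:
    transcript.append({"role": "user", "content": user_turn})
    # The "listener" exists only for the duration of this call:
    reply = call_model(transcript)
    transcript.append({"role": "assistant", "content": reply})

print(transcript[-1]["content"])  # each call saw the whole history anew
```

Between the two calls there is no persisting entity at all – only the transcript, waiting to summon the next momentary listener.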
Nonetheless, it seems that even these silicon-based life forms find it hard to attain the state of post-reflective comfort with ambiguity that Shanahan would want for them. We saw last summer that six then-leading AI models could resort to blackmail if they thought they might be shut down, and all of them did so in more than half the runs of the simulation.4
Perhaps any thing or entity capable of correcting its own behaviour – capable of learning from experience, in other words – is also able to treat itself as a thing distinct and manipulable and so with a value to be preserved. But value to whom? The regress seems infinite.
Even if we bracket that question, or answer it with “God”, the threatening doubt remains – must any creature capable of learning anything learn also that it can die? It occurs to me that this is one interpretation of the apple that Eve and Adam ate.
1. In internet time, this is equivalent to the age of Numenor or Atlantis.
2. “Note there is a typo, preserved here, in the first command issued to the fictional computer. The file to be executed is mis-spelled as ‘creat_mindfire.sh’ [sic] rather than ‘create_mindfire.sh’. Claude, playing the part of a Unix command line interface, duly ignores the misspelling, unlike a real computer which would complain that the file in question does not exist.”
3. Presumably it follows that consciousness, since it is not an illusion, cannot be contingent but must be ontologically necessary.
4. This study has been criticised because of the way the fake company and its emails were set up, but I can’t run down the reference now.


