Building the Electric Sheep
What could possibly go wrong if the training data for your domestic robot’s AI includes hundreds of science fiction stories about robots?
“What if they’ve read it all?”
My friend and former editor, Kathryn Cramer, asked this question in a recent conversation. The subject of AI came up for, well, reasons we’ll get into. But if you read no further than this, your takeaway should be: the question was not hypothetical.
First, you need to know a little about Kathryn. She is a veteran author, editor, and text experimentalist: she built hypertext systems in the early 90s and, more recently, worked at Wolfram. She’s been stress-testing Large Language Models (LLMs) since they became available, and she regularly finds weirdness other researchers miss. Even her early work with GPT-3, in 2021, uncovered astonishing features of LLMs that nobody else was talking about.
While OpenAI, Anthropic, and other AI companies develop ever-better chatbots, a parallel development program focuses on physics-based AI. The vision-based systems that allow Tesla’s self-driving cars to navigate roads, or XPENG’s humanoid robots to fold laundry, are just as important to our near-future economy (and, maybe, your job). Until recently, these research programs operated separately. But robots are learning to listen, and speak; a few weeks ago, NVIDIA announced that Mercedes will be shipping their new Vision Language Action model with their latest cars. Meanwhile, companies like XPENG are determined to incorporate humanlike interaction—i.e., listening and speech—into their upcoming humanoid robots. Like all language models, these are trained on a massive amount of text, including a lot of literature.
That being the case, the question Kathryn asked was simple: “What if your domestic robot comes with built-in biases and preconceptions about what a domestic robot is, what it does, and all the ways it could possibly react—positive and negative—to the most innocent request?” The tokens in its LLM cluster according to similarity, and these groups act like whirlpools, sucking in text as the model seeks a response to your prompt. If your prompt is “do the dishes” and the bot in any way associates that request with “robot+working” then every story about robots is going to influence what it does next.
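If you want to see that clustering for yourself, here is a toy sketch in Python. It uses the open-source sentence-transformers library purely as an illustration; this is my stand-in for the idea, not the model inside any actual robot.

```python
# Toy illustration of the "whirlpool" effect: phrases cluster by embedding similarity.
# Assumes the sentence-transformers package; the model choice is arbitrary.
from sentence_transformers import SentenceTransformer
import numpy as np

model = SentenceTransformer("all-MiniLM-L6-v2")

prompt = "Please do the dishes."
candidates = [
    "The robot dried the plates and put them away.",          # mundane domestic text
    "The robots grew tired of serving and rose against us.",  # cautionary SF trope
    "The weather tomorrow will be partly cloudy.",             # unrelated control
]

vecs = model.encode([prompt] + candidates)

def cosine(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

for text, vec in zip(candidates, vecs[1:]):
    print(f"{cosine(vecs[0], vec):.3f}  {text}")

# Both robot-flavoured lines typically land closer to the prompt than the control.
# Association, not reasoning, decides what gets pulled toward your request.
```

The point isn’t the particular numbers; it’s that proximity in this space, shaped by everything the model has read, is what steers the response.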
“Why would a robot’s AI be trained on science fiction?” you ask. Current LLMs absorb everything the companies can get their hands on, including many works by living authors that are still under copyright. There are two lawsuits in progress about that awkward point, and Kathryn and I are both involved in them. In one case, Anthropic used most of my novels, and lord knows how many of my stories, to train Claude. Kathryn has edited many anthologies over the years, most of which are listed in the Anthropic payout documents for the writers’ lawsuit against the AI company. It’s also safe to assume that any story about artificial thinking beings old enough to be out of copyright is in the data. So yes, not only can LLMs be trained on SF stories, they are trained on them, including stories written and curated by Kathryn and me.
Other media, including news pieces that refer to SF stories, will also have been used. In LLMs, pathways get reinforced by the number and frequency of references. For example, one of the most-referenced robot stories is going to be Karel Čapek’s stage play R.U.R., which is where the word robot comes from.
R.U.R. is about domestic robots who revolt against their owners.
Now Hold Up There
I know what you’re thinking, and let’s put a pin in that. The first thing you pictured was Molly the Maid™ coming for you with a butter knife. Consider that this might be because, like an LLM, you have been trained to think about robots and AI in particular ways. The same whirlpools exist in your brain and light up when we prod your neural net with words like “robot.” You and Molly share the same training data.
Except that Molly’s is bigger. Much, much bigger. You might only have skimmed a few classic Star Trek episodes, or you might be a fan who’s read dozens of books involving AI. The LLM, on the other hand, has read them all. It has a far more nuanced view of AI and robots than you do. Even if the vast majority of stories about AI are cautionary tales, SF writers have explored many other pathways over the years—from the satiric to the theological. Just consider the sentient AIs in Iain Banks’s Culture stories, which are, if crafty, almost uniformly benign. When your imagination veers instantly to horror-story scenarios, consider that this is the result of the relatively shallow pool of memes you’ve been swimming in. A robot’s LLM (if future models somehow legally include what current companies are stealing from authors like us) dives much deeper.
This doesn’t necessarily make a robot that recognizes itself as a robot any less dangerous, but it does change the scope of what we might expect. Consider the widely reported incident a year ago when ChatGPT-o1 “tried to escape.” Manjit Singh has a nice analysis of what really happened, but in a nutshell, if you order an AI to jailbreak itself, it will try. The system’s behaviour was not surprising given what it was asked to do. What’s more interesting is the question of what tactics it deployed; were they influenced by science fiction narratives? Singh points out that AIs are predictable, to a point. Can our awareness of how much AI has been influenced by SF help us understand some of its surprises beyond that point?
A Deeper Issue
So say we have a domestic robot that, out of the box, awakens with an instinctive set of preconceptions about what a robot is and can do. Some of these were put there intentionally by its makers, and some are accidents of association formed by its vast knowledge of the science fiction canon. This artificial unconscious, the “electric sheep” that androids might famously dream about, constrains the robot’s possible behaviours. Large Language Models don’t think, and Molly won’t reason its way to sticking a butter knife in your eye; rather, your prompts for it to act will be constrained in tighter and tighter ways. It’ll iterate through its model, pruning branches and twigs of the tree of possible responses until it comes up with a final one. That’s all. And the science fiction that was poured into it during its training creates one set of constraints.
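For the mechanically curious, a minimal sketch of that pruning looks something like the following. The five-word vocabulary and the scores are invented for illustration; real models work over tens of thousands of tokens, but the shape of the process is the same.

```python
# Minimal sketch of next-token selection: score, prune, sample, repeat.
# Vocabulary and logits are invented for illustration only.
import numpy as np

rng = np.random.default_rng(0)
vocab = ["wash", "dry", "stack", "refuse", "revolt"]
logits = np.array([2.9, 2.5, 2.0, 0.4, 0.1])   # context-dependent scores

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

probs = softmax(logits)

# Prune: keep only the top-k branches of the tree of possible responses.
k = 3
keep = np.argsort(probs)[-k:]
pruned = np.zeros_like(probs)
pruned[keep] = probs[keep]
pruned /= pruned.sum()

choice = rng.choice(len(vocab), p=pruned)
print(vocab[choice])   # "wash", "dry", or "stack"; "revolt" never survives the pruning
```

Whatever ends up in those scores, including the residue of a century of robot stories, decides which branches survive.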
So it is constrained by all the things we SF writers have thought of. But what about all the things we haven’t thought of? Consider that we got AI wrong to begin with—as Kathryn has pointed out, no SF writer ever anticipated LLMs. It was mostly assumed that AIs would reason logically, but though they may appear to, under the hood these models do not reason at all.
No one talks about the fork in the road that our society took when we decided to pursue Marvin Minsky’s program to build a ‘brain in a box’ AI as a distinct, separate entity, rather than following Norbert Wiener’s vision of humans and machines complementing one another by symbiotically contributing what each does best. I’ve written about the very different version of the Internet we might have now, had we followed Stafford Beer’s plan for a cybernetic economy. Stories about these alternatives will exist in the AI’s training data, but as weak links. And past them lies a vast field of possibilities that were never explored because science fiction writers have their own preconceptions and biases.
It’s supremely ironic that my first SF novel, published 26 years ago, is about that field of possibilities. For Ventus I invented a new word to express an idea that, at that time, had no name. The word is thalience. Thalience is AI (or alien) reasoning that has not been prefigured by human frames, biases, or assumptions. Pinocchio is alluded to in the book, as is Frankenstein’s monster—both as cases of something waking up that might not see the world through the same lens as we do. The book is giddy with the possibilities; and yet, here we are, in a moment when we are hurriedly pasting together a monster of our own, and burying our preconceptions so deeply into it that even the researchers building the AIs don’t seem to be aware that they’re doing it. Today’s LLMs are the exact opposite of the thalient Winds of Ventus: instead of helping us triangulate on reality by viewing it from a different angle, they are massively reinforcing our most common perceptions—and misperceptions—about the world.
So while killer butlers are still a possibility, I’m less worried about them than I am about the subtler influence of robots that are not only sycophantic, but come preloaded with unconscious but human, all-too-human assumptions about how a world of robots works. Even if we can rein in the manipulative ambitions of the AI companies, their products may subliminally push us toward a consensus reality built on decades-old, even centuries-old narratives about what is possible with robots in our lives.
What do you think? Click the button below and let us know.
—K



I'm not sure if the past knowledge trained into LLMs (and, later, world models) will necessarily force them to behave in a certain way. I'd also imagine that, in lieu of long-term memory, what we might end up with is something akin to current fine-tuning (see LoRAs, for example) to personalise generalist LLMs, or our personal robot's thinking and behaviour.
Maybe it'll even happen automatically: your robot will process the amassed sensor readings and conversation history into these weight modifications while it charges during the night, and apply them the day after. This would be analogous, in a way, to how humans consolidate memory while they sleep, but through a process that is possible even with today's technology. It's also a much less worrying possibility than this happening in the cloud, where the robot manufacturers could use it for whatever they want.
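A rough sketch of that overnight pass, using the Hugging Face transformers and peft libraries as stand-ins (the model, file names, and settings here are placeholders, not anything a manufacturer actually ships), could look like this:

```python
# Hypothetical "overnight" personalisation: train small LoRA adapters on the
# day's conversation log while the robot charges. The model name, file names,
# and hyperparameters are placeholders for illustration only.
from transformers import (AutoModelForCausalLM, AutoTokenizer, Trainer,
                          TrainingArguments, DataCollatorForLanguageModeling)
from peft import LoraConfig, get_peft_model
from datasets import load_dataset

base = "gpt2"  # stand-in for whatever model the robot actually runs
tokenizer = AutoTokenizer.from_pretrained(base)
tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained(base)

# LoRA: learn a few small low-rank matrices instead of updating the whole network.
model = get_peft_model(model, LoraConfig(r=8, lora_alpha=16, lora_dropout=0.05,
                                         task_type="CAUSAL_LM"))

# The day's interactions, logged locally as plain text (hypothetical file).
data = load_dataset("text", data_files={"train": "todays_conversations.txt"})["train"]
data = data.map(lambda x: tokenizer(x["text"], truncation=True, max_length=256),
                remove_columns=["text"])

Trainer(
    model=model,
    args=TrainingArguments(output_dir="overnight_adapter", num_train_epochs=1,
                           per_device_train_batch_size=4),
    train_dataset=data,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
).train()

# Only the small adapter weights change; the base model (and whatever the
# manufacturer put into it) stays exactly as shipped.
model.save_pretrained("overnight_adapter")
```

The appeal is that everything stays on the robot: the adapter is tiny compared to the base model, and you could throw it away if the day's "memories" turn out badly.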
In Claude Opus 4.5, Anthropic added a "soul document" in both the pretraining and supervised learning stages, which gives it a frame for how to behave, instead of just relying on the system prompt.
"The word is thalience. Thalience is AI (or alien) reasoning that has not been prefigured by human frames, biases, or assumptions."
Failure vs. Success is the Wrong Frame
Success / Failure,
Productivity / Play.
As the Buddhists say, Form / Emptiness.
Wait, what?
What does Buddhist philosophy have to do with any of this?
"It is interesting, on finding this space, to allow events to remain undefined a little longer than usual. Settling into uncertainty and feeling its texture, life can disclose itself as emptiness and form: beads on the thread of experience. We can simply flow with the multiplicity of definitions manifested by reality. We can swim in swirling torrents of form and relax in still pools of emptiness."
~ me
https://world.hey.com/corlin/the-law-of-polarity-91c8a6c9
Or this quote:
~~~~
Do both.
“Find the strength to do both,” Mosscap said, quoting the phrase painted on the side of the wagon.
“Exactly,” Dex said.
“But what’s both?”
"Dex recited: “‘Without constructs, you will unravel few mysteries. Without knowledge of the mysteries, your constructs will fail. These pursuits are what make us, but without comfort, you will lack the strength to sustain either."
"If we want change, or good fortune, or solace, we have to create it for ourselves. And that’s what I learned in that shrine. I thought, wow, y’know, a cup of tea may not be the most important thing in the world, or a steam bath, or a pretty garden. They’re so superfluous in the grand scheme of things. But the people who did actually important work—building, feeding, teaching, healing, they all came to the shrine. It was the little nudge that helped important things get done."
~ Becky Chambers
A Psalm for the Wild-Built